Overview

Dataset statistics

Number of variables: 27
Number of observations: 45346
Missing cells: 239390
Missing cells (%): 19.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.3 MiB
Average record size in memory: 216.0 B
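The percentages above follow from the raw counts. The sketch below recomputes the missing-cell share from the figures in this report. The 8-bytes-per-column figure is an assumption (a float64 value or an object pointer per cell), not something the report states.

```python
# Figures copied from the dataset statistics above.
n_variables = 27
n_observations = 45346
missing_cells = 239390

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each column costs 8 bytes per row (float64 or object pointer).
assumed_bytes_per_row = n_variables * 8

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"assumed record size: {assumed_bytes_per_row} B")
```

Under that assumption the record size agrees with the 216.0 B reported above.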

Variable types

Numeric: 10
Categorical: 17

Alerts

title has a high cardinality: 42196 distinct values [High cardinality]
overview has a high cardinality: 44231 distinct values [High cardinality]
original_language has a high cardinality: 89 distinct values [High cardinality]
tagline has a high cardinality: 20268 distinct values [High cardinality]
name_btc has a high cardinality: 1078 distinct values [High cardinality]
poster_btc has a high cardinality: 1078 distinct values [High cardinality]
backdrop_btc has a high cardinality: 1077 distinct values [High cardinality]
iso_639_1 has a high cardinality: 1916 distinct values [High cardinality]
language_name has a high cardinality: 1827 distinct values [High cardinality]
companies_id has a high cardinality: 22290 distinct values [High cardinality]
companies_name has a high cardinality: 22240 distinct values [High cardinality]
countries_iso has a high cardinality: 2383 distinct values [High cardinality]
countries_name has a high cardinality: 2383 distinct values [High cardinality]
release_date has a high cardinality: 17333 distinct values [High cardinality]
popularity is highly overall correlated with vote_count [High correlation]
vote_count is highly overall correlated with popularity and 1 other field [High correlation]
budget is highly overall correlated with revenue and 1 other field [High correlation]
revenue is highly overall correlated with vote_count and 2 other fields [High correlation]
return is highly overall correlated with budget and 1 other field [High correlation]
status is highly imbalanced (97.0%) [Imbalance]
original_language is highly imbalanced (67.4%) [Imbalance]
iso_639_1 is highly imbalanced (62.0%) [Imbalance]
language_name is highly imbalanced (62.0%) [Imbalance]
countries_iso is highly imbalanced (57.7%) [Imbalance]
countries_name is highly imbalanced (57.7%) [Imbalance]
overview has 946 (2.1%) missing values [Missing]
tagline has 24960 (55.0%) missing values [Missing]
id_btc has 42183 (93.0%) missing values [Missing]
name_btc has 42183 (93.0%) missing values [Missing]
poster_btc has 42183 (93.0%) missing values [Missing]
backdrop_btc has 42183 (93.0%) missing values [Missing]
iso_639_1 has 3792 (8.4%) missing values [Missing]
language_name has 3915 (8.6%) missing values [Missing]
companies_id has 12264 (27.0%) missing values [Missing]
companies_name has 12264 (27.0%) missing values [Missing]
countries_iso has 6213 (13.7%) missing values [Missing]
countries_name has 6213 (13.7%) missing values [Missing]
popularity is highly skewed (γ1 = 29.21542294) [Skewed]
return is highly skewed (γ1 = 138.283787) [Skewed]
title is uniformly distributed [Uniform]
overview is uniformly distributed [Uniform]
tagline is uniformly distributed [Uniform]
id has unique values [Unique]
vote_average has 2944 (6.5%) zeros [Zeros]
vote_count has 2846 (6.3%) zeros [Zeros]
runtime has 1781 (3.9%) zeros [Zeros]
budget has 36470 (80.4%) zeros [Zeros]
revenue has 37949 (83.7%) zeros [Zeros]
return has 40033 (88.3%) zeros [Zeros]

Reproduction

Analysis started: 2023-07-01 12:46:37.275489
Analysis finished: 2023-07-01 12:47:50.457230
Duration: 1 minute and 13.18 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
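The reported duration can be checked against the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-07-01 12:46:37.275489")
finished = datetime.fromisoformat("2023-07-01 12:47:50.457230")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```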

Variables

id
Real number (ℝ)

Distinct: 45346
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108042.22
Minimum: 2
Maximum: 469172
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5340.25
Q1: 26390.25
Median: 59852.5
Q3: 156601.5
95-th percentile: 357370.75
Maximum: 469172
Range: 469170
Interquartile range (IQR): 130211.25

Descriptive statistics

Standard deviation: 112187.33
Coefficient of variation (CV): 1.0383656
Kurtosis: 0.55836782
Mean: 108042.22
Median Absolute Deviation (MAD): 44405
Skewness: 1.2828454
Sum: 4.8992825 × 10^9
Variance: 1.2585996 × 10^10
Monotonicity: Not monotonic
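Several of the derived statistics for id can be recomputed from the others, which also confirms the powers of ten on the sum and variance. The sketch below uses only figures printed in this report; because the inputs are rounded, the recomputed sum and variance are approximate.

```python
# Figures copied from the id statistics above.
n = 45346
minimum, maximum = 2, 469172
q1, q3 = 26390.25, 156601.5
mean, std = 108042.22, 112187.33

iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
cv = std / mean                # coefficient of variation
approx_sum = mean * n          # close to 4.8992825e9
approx_variance = std ** 2     # close to 1.2585996e10

print(iqr, value_range, round(cv, 7))
print(f"{approx_sum:.7e} {approx_variance:.7e}")
```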
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
862                    1       < 0.1%
202198                 1       < 0.1%
124026                 1       < 0.1%
300168                 1       < 0.1%
132316                 1       < 0.1%
74458                  1       < 0.1%
40777                  1       < 0.1%
188222                 1       < 0.1%
328483                 1       < 0.1%
107637                 1       < 0.1%
Other values (45336)   45336   > 99.9%

Value    Count   Frequency (%)
2        1       < 0.1%
3        1       < 0.1%
5        1       < 0.1%
6        1       < 0.1%
11       1       < 0.1%
12       1       < 0.1%
13       1       < 0.1%
14       1       < 0.1%
15       1       < 0.1%
16       1       < 0.1%

Value    Count   Frequency (%)
469172   1       < 0.1%
468707   1       < 0.1%
468343   1       < 0.1%
467731   1       < 0.1%
465044   1       < 0.1%
464819   1       < 0.1%
464207   1       < 0.1%
464111   1       < 0.1%
463906   1       < 0.1%
463800   1       < 0.1%

title
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 42196
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 354.4 KiB
Cinderella: 11
Hamlet: 9
Alice in Wonderland: 9
Beauty and the Beast: 8
Les Misérables: 8
Other values (42191): 45301

Length

Max length: 105
Median length: 79
Mean length: 16.702289
Min length: 1

Characters and Unicode

Total characters: 757382
Distinct characters: 287
Distinct categories: 17
Distinct scripts: 7
Distinct blocks: 12
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
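
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one of the sample titles. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code.

    'Lu' is Uppercase Letter, 'Ll' is Lowercase Letter and
    'Zs' is Space Separator, matching the category names in this report.
    """
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("Toy Story")
print(counts)
```

Accented letters such as the "é" in "Les Misérables" still fall under Lowercase Letter; only their block differs from ASCII.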

Unique

Unique: 39892
Unique (%): 88.0%
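
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. The sketch below shows the difference on a small made-up list (the list is illustrative, not taken from the dataset), then recomputes the two title percentages from the counts in this report.

```python
from collections import Counter

# Hypothetical sample: two titles repeat, two appear once.
titles = ["Cinderella", "Hamlet", "Cinderella", "Jumanji", "Hamlet", "Toy Story"]
counts = Counter(titles)

n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

# Percentages for title, from the counts reported above.
n_observations = 45346
distinct_pct = 100 * 42196 / n_observations
unique_pct = 100 * 39892 / n_observations

print(n_distinct, n_unique)
print(f"{distinct_pct:.1f}% {unique_pct:.1f}%")
```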

Sample

1st row: Toy Story
2nd row: Jumanji
3rd row: Grumpier Old Men
4th row: Waiting to Exhale
5th row: Father of the Bride Part II

Common Values

Value                           Count   Frequency (%)
Cinderella                      11      < 0.1%
Hamlet                          9       < 0.1%
Alice in Wonderland             9       < 0.1%
Beauty and the Beast            8       < 0.1%
Les Misérables                  8       < 0.1%
Treasure Island                 7       < 0.1%
The Three Musketeers            7       < 0.1%
A Christmas Carol               7       < 0.1%
Bluebeard                       6       < 0.1%
The Hound of the Baskervilles   6       < 0.1%
Other values (42186)            45268   99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 14544
 
10.7%
of 4923
 
3.6%
a 2238
 
1.6%
in 1693
 
1.2%
and 1629
 
1.2%
to 1053
 
0.8%
756
 
0.6%
man 665
 
0.5%
love 664
 
0.5%
for 601
 
0.4%
Other values (24353) 107329
78.9%

Most occurring characters

ValueCountFrequency (%)
90771
 
12.0%
e 76195
 
10.1%
a 48911
 
6.5%
o 45636
 
6.0%
n 40797
 
5.4%
r 39993
 
5.3%
i 39748
 
5.2%
t 36706
 
4.8%
s 29500
 
3.9%
h 28499
 
3.8%
Other values (277) 280626
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 533789
70.5%
Uppercase Letter 117198
 
15.5%
Space Separator 90771
 
12.0%
Other Punctuation 10485
 
1.4%
Decimal Number 3845
 
0.5%
Dash Punctuation 980
 
0.1%
Close Punctuation 87
 
< 0.1%
Open Punctuation 85
 
< 0.1%
Final Punctuation 38
 
< 0.1%
Other Letter 25
 
< 0.1%
Other values (7) 79
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 76195
14.3%
a 48911
9.2%
o 45636
 
8.5%
n 40797
 
7.6%
r 39993
 
7.5%
i 39748
 
7.4%
t 36706
 
6.9%
s 29500
 
5.5%
h 28499
 
5.3%
l 25904
 
4.9%
Other values (121) 121900
22.8%
Uppercase Letter
ValueCountFrequency (%)
T 16010
13.7%
S 10332
 
8.8%
M 8029
 
6.9%
B 7653
 
6.5%
C 7157
 
6.1%
A 6782
 
5.8%
D 6330
 
5.4%
L 5869
 
5.0%
H 5170
 
4.4%
W 5162
 
4.4%
Other values (65) 38704
33.0%
Other Letter
ValueCountFrequency (%)
چ 2
 
8.0%
ه 2
 
8.0%
ک 2
 
8.0%
ی 2
 
8.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
ª 1
 
4.0%
Other values (11) 11
44.0%
Other Punctuation
ValueCountFrequency (%)
: 3714
35.4%
' 2505
23.9%
. 1603
15.3%
, 1133
 
10.8%
! 647
 
6.2%
& 458
 
4.4%
? 269
 
2.6%
/ 79
 
0.8%
* 19
 
0.2%
# 13
 
0.1%
Other values (8) 45
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 861
22.4%
1 695
18.1%
0 616
16.0%
3 482
12.5%
9 229
 
6.0%
4 228
 
5.9%
5 224
 
5.8%
7 193
 
5.0%
8 161
 
4.2%
6 156
 
4.1%
Math Symbol
ValueCountFrequency (%)
+ 17
70.8%
× 3
 
12.5%
1
 
4.2%
= 1
 
4.2%
1
 
4.2%
1
 
4.2%
Other Number
ValueCountFrequency (%)
½ 12
63.2%
² 3
 
15.8%
³ 2
 
10.5%
1
 
5.3%
1
 
5.3%
Other Symbol
ValueCountFrequency (%)
° 3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Currency Symbol
ValueCountFrequency (%)
$ 18
85.7%
¢ 2
 
9.5%
£ 1
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 965
98.5%
15
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 82
94.3%
] 5
 
5.7%
Open Punctuation
ValueCountFrequency (%)
( 80
94.1%
[ 5
 
5.9%
Final Punctuation
ValueCountFrequency (%)
37
97.4%
1
 
2.6%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
90771
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Format
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 650472
85.9%
Common 106370
 
14.0%
Cyrillic 346
 
< 0.1%
Greek 170
 
< 0.1%
Arabic 11
 
< 0.1%
Katakana 8
 
< 0.1%
Han 5
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 76195
 
11.7%
a 48911
 
7.5%
o 45636
 
7.0%
n 40797
 
6.3%
r 39993
 
6.1%
i 39748
 
6.1%
t 36706
 
5.6%
s 29500
 
4.5%
h 28499
 
4.4%
l 25904
 
4.0%
Other values (107) 238583
36.7%
Common
ValueCountFrequency (%)
90771
85.3%
: 3714
 
3.5%
' 2505
 
2.4%
. 1603
 
1.5%
, 1133
 
1.1%
- 965
 
0.9%
2 861
 
0.8%
1 695
 
0.7%
! 647
 
0.6%
0 616
 
0.6%
Other values (50) 2860
 
2.7%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.2%
е 32
 
9.2%
а 29
 
8.4%
н 24
 
6.9%
и 23
 
6.6%
р 22
 
6.4%
к 17
 
4.9%
с 15
 
4.3%
в 14
 
4.0%
т 14
 
4.0%
Other values (38) 124
35.8%
Greek
ValueCountFrequency (%)
α 20
 
11.8%
ο 14
 
8.2%
ι 14
 
8.2%
τ 9
 
5.3%
λ 8
 
4.7%
ά 8
 
4.7%
ρ 8
 
4.7%
ν 7
 
4.1%
π 6
 
3.5%
ς 6
 
3.5%
Other values (32) 70
41.2%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
چ 2
18.2%
ه 2
18.2%
ک 2
18.2%
ی 2
18.2%
س 1
9.1%
ا 1
9.1%
ج 1
9.1%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 755820
99.8%
None 1121
 
0.1%
Cyrillic 346
 
< 0.1%
Punctuation 62
 
< 0.1%
Arabic 11
 
< 0.1%
Katakana 8
 
< 0.1%
CJK 5
 
< 0.1%
Misc Symbols 3
 
< 0.1%
Letterlike Symbols 2
 
< 0.1%
Math Operators 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
90771
 
12.0%
e 76195
 
10.1%
a 48911
 
6.5%
o 45636
 
6.0%
n 40797
 
5.4%
r 39993
 
5.3%
i 39748
 
5.3%
t 36706
 
4.9%
s 29500
 
3.9%
h 28499
 
3.8%
Other values (76) 279064
36.9%
None
ValueCountFrequency (%)
é 216
19.3%
ä 127
 
11.3%
ö 55
 
4.9%
è 53
 
4.7%
ô 44
 
3.9%
ü 39
 
3.5%
ó 37
 
3.3%
ı 35
 
3.1%
á 35
 
3.1%
í 33
 
2.9%
Other values (108) 447
39.9%
Punctuation
ValueCountFrequency (%)
37
59.7%
15
24.2%
5
 
8.1%
2
 
3.2%
1
 
1.6%
1
 
1.6%
1
 
1.6%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.2%
е 32
 
9.2%
а 29
 
8.4%
н 24
 
6.9%
и 23
 
6.6%
р 22
 
6.4%
к 17
 
4.9%
с 15
 
4.3%
в 14
 
4.0%
т 14
 
4.0%
Other values (38) 124
35.8%
Arabic
ValueCountFrequency (%)
چ 2
18.2%
ه 2
18.2%
ک 2
18.2%
ی 2
18.2%
س 1
9.1%
ا 1
9.1%
ج 1
9.1%
Misc Symbols
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arrows
ValueCountFrequency (%)
1
100.0%

overview
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct: 44231
Distinct (%): 99.6%
Missing: 946
Missing (%): 2.1%
Memory size: 354.4 KiB
Nooverviewfound.: 133
NoOverview: 7
Nomovieoverviewavailable.: 3
AdaptationoftheJaneAustennovel.: 3
Afewfunnylittlenovelsaboutdifferentaspectsoflife.: 3
Other values (44226): 44251

Length

Max length: 851
Median length: 666
Mean length: 269.13806
Min length: 1

Characters and Unicode

Total characters: 11949730
Distinct characters: 423
Distinct categories: 22
Distinct scripts: 13
Distinct blocks: 21
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44201
Unique (%): 99.6%

Sample

1st rowLedbyWoody,Andy'stoyslivehappilyinhisroomuntilAndy'sbirthdaybringsBuzzLightyearontothescene.AfraidoflosinghisplaceinAndy'sheart,WoodyplotsagainstBuzz.ButwhencircumstancesseparateBuzzandWoodyfromtheirowner,theduoeventuallylearnstoputasidetheirdifferences.
2nd rowWhensiblingsJudyandPeterdiscoveranenchantedboardgamethatopensthedoortoamagicalworld,theyunwittinglyinviteAlan--anadultwho'sbeentrappedinsidethegamefor26years--intotheirlivingroom.Alan'sonlyhopeforfreedomistofinishthegame,whichprovesriskyasallthreefindthemselvesrunningfromgiantrhinoceroses,evilmonkeysandotherterrifyingcreatures.
3rd rowAfamilyweddingreignitestheancientfeudbetweennext-doorneighborsandfishingbuddiesJohnandMax.Meanwhile,asultryItaliandivorcéeopensarestaurantatthelocalbaitshop,alarmingthelocalswhoworryshe'llscarethefishaway.Butshe'slessinterestedinseafoodthansheisincookingupahottimewithMax.
4th rowCheatedon,mistreatedandsteppedon,thewomenareholdingtheirbreath,waitingfortheelusive"goodman"tobreakastringofless-than-stellarlovers.FriendsandconfidantsVannah,Bernie,GloandRobintalkitallout,determinedtofindabetterwaytobreathe.
5th rowJustwhenGeorgeBankshasrecoveredfromhisdaughter'swedding,hereceivesthenewsthatshe'spregnant...andthatGeorge'swife,Nina,isexpectingtoo.Hewasplanningonsellingtheirhome,butthat'saplanthat--likeGeorge--willhavetochangewiththearrivalofbothagrandchildandakidofhisown.

Common Values

ValueCountFrequency (%)
Nooverviewfound. 133
 
0.3%
NoOverview 7
 
< 0.1%
Nomovieoverviewavailable. 3
 
< 0.1%
AdaptationoftheJaneAustennovel. 3
 
< 0.1%
Afewfunnylittlenovelsaboutdifferentaspectsoflife. 3
 
< 0.1%
Whenfourwomenmoveintoanoldhouseleftbyonewoman'saunt,strangethingsbegintohappen.Bizarrevoices,visionsofghosts,andmysteriousnoisesleadthemtodiscoverthedarkestpowersofevilandahorrorandagonybeyondterror. 2
 
< 0.1%
AdventurerAllanQuartermainleadsanexpeditionintounchartedAfricanterritoryinanattempttolocateanexplorerwhowentmissingduringhissearchforthefableddiamondminesofKingSolomon. 2
 
< 0.1%
DirectorMichaelAptedrevisitsthesamegroupofBritish-bornadultsaftera7yearwait.Thesubjectsareinterviewedastothechangesthathaveoccurredintheirlivesduringthelastsevenyears. 2
 
< 0.1%
Wilburthepigisscaredoftheendoftheseason,becauseheknowsthatcomethattime,hewillenduponthedinnertable.HehatchesaplanwithCharlotte,aspiderthatlivesinhispen,toensurethatthiswillneverhappen. 2
 
< 0.1%
AwoodenboyBuratinotriestofindhisplaceinlife.HebefriendstoysfromatoytheaterownedbyevilKarabas-Barabas,getstrickedbyAlicetheFoxandBasiliotheCatandfinallydiscoversthemysteryofagoldenkeygiventohimbykindTortilatheTortoise. 2
 
< 0.1%
Other values (44221) 44241
97.6%
(Missing) 946
 
2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
nooverviewfound 134
 
0.3%
nooverview 9
 
< 0.1%
nooverviewyet 3
 
< 0.1%
nomovieoverviewavailable 3
 
< 0.1%
adaptationofthejaneaustennovel 3
 
< 0.1%
afewfunnylittlenovelsaboutdifferentaspectsoflife 3
 
< 0.1%
funny,entertainingcomedywithafewstorylines.allofthemhaveonethingincommon-aresorttownofriminiinitaly 2
 
< 0.1%
mary,awriterworkingonanovelaboutalovetriangle,isattractedtoherpublisher.hersuitorjimmyisdeterminedtobreakthemup;heintroducesmarytothepublisher'swifewithouttellingmarywhosheis 2
 
< 0.1%
nickcarraway,ayoungmidwesternernowlivingonlongisland,findshimselffascinatedbythemysteriouspastandlavishlifestyleofhisneighbor,thenouveaurichejaygatsby.heisdrawnintogatsby'scircle,becomingawitnesstoobsessionandtragedy 2
 
< 0.1%
poorbuthappy,youngnelloandhisgrandfatherlivealone,deliveringmilkasalivelihood,intheoutskirtsofantwerp,acityinflanders(theflemishordutch-speakingpartofmodern-daybelgium).theydiscoverabeatendog(abouvier,alargesturdydognativetoflanders)andadoptitandnurseitbacktohealth,namingitpatrasche,themiddlenameofnello'smothermary,whodiedwhennellowasveryyoung.nello'smotherwasatalentedartist,andlikehismother,hedelightsindrawing,andhisfriendaloiseishismodelandgreatestfanandsupporter 2
 
< 0.1%
Other values (44215) 44237
99.6%

Most occurring characters

ValueCountFrequency (%)
e 1362739
 
11.4%
a 939721
 
7.9%
t 934056
 
7.8%
i 850842
 
7.1%
o 829251
 
6.9%
n 821950
 
6.9%
s 767224
 
6.4%
r 743645
 
6.2%
h 600339
 
5.0%
l 478418
 
4.0%
Other values (413) 3621545
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11141275
93.2%
Uppercase Letter 390651
 
3.3%
Other Punctuation 312523
 
2.6%
Decimal Number 42192
 
0.4%
Dash Punctuation 36745
 
0.3%
Close Punctuation 10094
 
0.1%
Open Punctuation 10071
 
0.1%
Final Punctuation 4549
 
< 0.1%
Initial Punctuation 880
 
< 0.1%
Currency Symbol 329
 
< 0.1%
Other values (12) 421
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1362739
12.2%
a 939721
 
8.4%
t 934056
 
8.4%
i 850842
 
7.6%
o 829251
 
7.4%
n 821950
 
7.4%
s 767224
 
6.9%
r 743645
 
6.7%
h 600339
 
5.4%
l 478418
 
4.3%
Other values (142) 2813090
25.2%
Uppercase Letter
ValueCountFrequency (%)
A 42722
 
10.9%
T 35940
 
9.2%
S 31102
 
8.0%
M 23942
 
6.1%
B 23679
 
6.1%
C 22771
 
5.8%
H 19415
 
5.0%
W 18633
 
4.8%
I 16782
 
4.3%
D 16306
 
4.2%
Other values (77) 139359
35.7%
Other Letter
ValueCountFrequency (%)
6
 
4.8%
6
 
4.8%
5
 
4.0%
4
 
3.2%
3
 
2.4%
3
 
2.4%
3
 
2.4%
3
 
2.4%
2
 
1.6%
2
 
1.6%
Other values (76) 88
70.4%
Other Punctuation
ValueCountFrequency (%)
, 133326
42.7%
. 124703
39.9%
' 31039
 
9.9%
" 11660
 
3.7%
: 3294
 
1.1%
? 2759
 
0.9%
; 2492
 
0.8%
! 1540
 
0.5%
/ 765
 
0.2%
& 452
 
0.1%
Other values (12) 493
 
0.2%
Nonspacing Mark
ValueCountFrequency (%)
́ 4
12.1%
ి 4
12.1%
3
9.1%
3
9.1%
3
9.1%
̈ 3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
Other values (4) 5
15.2%
Decimal Number
ValueCountFrequency (%)
1 9738
23.1%
0 8262
19.6%
9 6399
15.2%
2 4249
10.1%
5 2439
 
5.8%
8 2378
 
5.6%
3 2338
 
5.5%
4 2173
 
5.2%
7 2131
 
5.1%
6 2085
 
4.9%
Spacing Mark
ValueCountFrequency (%)
11
40.7%
4
 
14.8%
3
 
11.1%
3
 
11.1%
2
 
7.4%
ि 2
 
7.4%
1
 
3.7%
ி 1
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
- 35222
95.9%
881
 
2.4%
633
 
1.7%
5
 
< 0.1%
4
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
® 45
70.3%
14
 
21.9%
° 2
 
3.1%
¦ 2
 
3.1%
1
 
1.6%
Math Symbol
ValueCountFrequency (%)
~ 20
50.0%
+ 11
27.5%
= 6
 
15.0%
| 2
 
5.0%
1
 
2.5%
Open Punctuation
ValueCountFrequency (%)
( 10018
99.5%
[ 50
 
0.5%
{ 2
 
< 0.1%
1
 
< 0.1%
Currency Symbol
ValueCountFrequency (%)
$ 317
96.4%
£ 10
 
3.0%
1
 
0.3%
1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 10042
99.5%
] 50
 
0.5%
} 2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
3842
84.5%
688
 
15.1%
» 19
 
0.4%
Initial Punctuation
ValueCountFrequency (%)
670
76.1%
192
 
21.8%
« 18
 
2.0%
Modifier Symbol
ValueCountFrequency (%)
´ 25
65.8%
` 12
31.6%
¯ 1
 
2.6%
Format
ValueCountFrequency (%)
31
60.8%
­ 20
39.2%
Other Number
ValueCountFrequency (%)
½ 8
50.0%
¹ 8
50.0%
Control
ValueCountFrequency (%)
’ 3
75.0%
 1
 
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11526694
96.5%
Common 417617
 
3.5%
Cyrillic 4587
 
< 0.1%
Greek 648
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Han 10
 
< 0.1%
Hangul 9
 
< 0.1%
Other values (3) 19
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1362739
11.8%
a 939721
 
8.2%
t 934056
 
8.1%
i 850842
 
7.4%
o 829251
 
7.2%
n 821950
 
7.1%
s 767224
 
6.7%
r 743645
 
6.5%
h 600339
 
5.2%
l 478418
 
4.2%
Other values (132) 3198509
27.7%
Common
ValueCountFrequency (%)
, 133326
31.9%
. 124703
29.9%
- 35222
 
8.4%
' 31039
 
7.4%
" 11660
 
2.8%
) 10042
 
2.4%
( 10018
 
2.4%
1 9738
 
2.3%
0 8262
 
2.0%
9 6399
 
1.5%
Other values (65) 37208
 
8.9%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Greek
ValueCountFrequency (%)
α 60
 
9.3%
ο 55
 
8.5%
τ 43
 
6.6%
η 36
 
5.6%
ι 36
 
5.6%
ν 34
 
5.2%
ρ 31
 
4.8%
ε 31
 
4.8%
π 30
 
4.6%
ς 30
 
4.6%
Other values (33) 262
40.4%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Inherited
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11931797
99.8%
Punctuation 7252
 
0.1%
None 5883
 
< 0.1%
Cyrillic 4587
 
< 0.1%
Devanagari 77
 
< 0.1%
Telugu 30
 
< 0.1%
Hiragana 20
 
< 0.1%
Tamil 19
 
< 0.1%
Letterlike Symbols 14
 
< 0.1%
CJK 10
 
< 0.1%
Other values (11) 41
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1362739
 
11.4%
a 939721
 
7.9%
t 934056
 
7.8%
i 850842
 
7.1%
o 829251
 
6.9%
n 821950
 
6.9%
s 767224
 
6.4%
r 743645
 
6.2%
h 600339
 
5.0%
l 478418
 
4.0%
Other values (80) 3603612
30.2%
Punctuation
ValueCountFrequency (%)
3842
53.0%
881
 
12.1%
688
 
9.5%
670
 
9.2%
633
 
8.7%
303
 
4.2%
192
 
2.6%
31
 
0.4%
5
 
0.1%
4
 
0.1%
Other values (2) 3
 
< 0.1%
None
ValueCountFrequency (%)
é 1544
26.2%
ä 294
 
5.0%
á 293
 
5.0%
ö 250
 
4.2%
í 243
 
4.1%
è 209
 
3.6%
ü 178
 
3.0%
ı 165
 
2.8%
ó 164
 
2.8%
ç 158
 
2.7%
Other values (139) 2385
40.5%
Cyrillic
ValueCountFrequency (%)
о 470
 
10.2%
е 404
 
8.8%
а 373
 
8.1%
н 323
 
7.0%
и 299
 
6.5%
т 265
 
5.8%
р 240
 
5.2%
с 218
 
4.8%
в 173
 
3.8%
л 161
 
3.5%
Other values (46) 1661
36.2%
Letterlike Symbols
ValueCountFrequency (%)
14
100.0%
Devanagari
ValueCountFrequency (%)
11
 
14.3%
6
 
7.8%
6
 
7.8%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
3
 
3.9%
Other values (21) 30
39.0%
Diacriticals
ValueCountFrequency (%)
́ 4
57.1%
̈ 3
42.9%
Hiragana
ValueCountFrequency (%)
4
20.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (7) 7
35.0%
Alphabetic PF
ValueCountFrequency (%)
4
100.0%
Telugu
ValueCountFrequency (%)
ి 4
13.3%
3
10.0%
3
10.0%
3
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
Other values (6) 6
20.0%
Tamil
ValueCountFrequency (%)
3
15.8%
2
10.5%
2
10.5%
2
10.5%
2
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Other values (3) 3
15.8%
Number Forms
ValueCountFrequency (%)
2
100.0%
Hangul
ValueCountFrequency (%)
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Arabic
ValueCountFrequency (%)
م 2
50.0%
ہ 1
25.0%
ت 1
25.0%
Modifier Letters
ValueCountFrequency (%)
ʼ 2
100.0%
Thai
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Math Operators
ValueCountFrequency (%)
1
100.0%
Specials
ValueCountFrequency (%)
1
100.0%
Katakana
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%

popularity
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 43719
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.926188
Minimum: 0
Maximum: 547.4883
Zeros: 40
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.020823
Q1: 0.38873225
Median: 1.130176
Q3: 3.6893365
95-th percentile: 11.063757
Maximum: 547.4883
Range: 547.4883
Interquartile range (IQR): 3.3006043

Descriptive statistics

Standard deviation: 6.0109699
Coefficient of variation (CV): 2.0541981
Kurtosis: 1923.3033
Mean: 2.926188
Median Absolute Deviation (MAD): 0.967289
Skewness: 29.215423
Sum: 132690.92
Variance: 36.131759
Monotonicity: Not monotonic
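
A skewness this large means a long right tail: a few very popular titles pull the mean (2.93) well above the median (1.13). The sketch below computes the plain moment-based skewness on two small made-up samples to show the sign convention. It is an illustration only; the profiler likely relies on pandas, which applies a bias correction, so its values would differ slightly on the same data.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [0, 0, 1, 1, 2, 50]

print(skewness(symmetric))   # no tail in either direction
print(skewness(long_tail))   # positive: one large value stretches the right tail
```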
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
1 × 10^-6              56      0.1%
0.000308               42      0.1%
0                      40      0.1%
0.00022                39      0.1%
0.001177               38      0.1%
0.000844               38      0.1%
0.000578               38      0.1%
0.002001               27      0.1%
0.003013               21      < 0.1%
0.00353                19      < 0.1%
Other values (43709)   44988   99.2%

Value       Count   Frequency (%)
0           40      0.1%
1 × 10^-6   56      0.1%
2 × 10^-6   6       < 0.1%
3 × 10^-6   6       < 0.1%
4 × 10^-6   5       < 0.1%
5 × 10^-6   1       < 0.1%
6 × 10^-6   2       < 0.1%
7 × 10^-6   1       < 0.1%
8 × 10^-6   6       < 0.1%
9 × 10^-6   2       < 0.1%

Value        Count   Frequency (%)
547.488298   1       < 0.1%
294.337037   1       < 0.1%
287.253654   1       < 0.1%
228.032744   1       < 0.1%
213.849907   1       < 0.1%
187.860492   1       < 0.1%
185.330992   1       < 0.1%
185.070892   1       < 0.1%
183.870374   1       < 0.1%
154.801009   1       < 0.1%

vote_average
Real number (ℝ)

Distinct: 92
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.6241962
Minimum: 0
Maximum: 10
Zeros: 2944
Zeros (%): 6.5%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 6
Q3: 6.8
95-th percentile: 7.8
Maximum: 10
Range: 10
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 1.915339
Coefficient of variation (CV): 0.34055337
Kurtosis: 2.5420383
Mean: 5.6241962
Median Absolute Deviation (MAD): 0.9
Skewness: -1.5243174
Sum: 255034.8
Variance: 3.6685234
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2944
 
6.5%
6 2461
 
5.4%
5 1994
 
4.4%
7 1882
 
4.2%
6.5 1722
 
3.8%
6.3 1602
 
3.5%
5.5 1381
 
3.0%
5.8 1369
 
3.0%
6.4 1348
 
3.0%
6.7 1339
 
3.0%
Other values (82) 27304
60.2%
ValueCountFrequency (%)
0 2944
6.5%
0.5 13
 
< 0.1%
0.7 1
 
< 0.1%
1 103
 
0.2%
1.1 1
 
< 0.1%
1.2 4
 
< 0.1%
1.3 13
 
< 0.1%
1.4 5
 
< 0.1%
1.5 30
 
0.1%
1.6 6
 
< 0.1%
ValueCountFrequency (%)
10 185
0.4%
9.8 1
 
< 0.1%
9.6 1
 
< 0.1%
9.5 18
 
< 0.1%
9.4 3
 
< 0.1%
9.3 18
 
< 0.1%
9.2 4
 
< 0.1%
9.1 2
 
< 0.1%
9 158
0.3%
8.9 7
 
< 0.1%

vote_count
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1820
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 110.13529
Minimum: 0
Maximum: 14075
Zeros: 2846
Zeros (%): 6.3%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 10
Q3: 34
95-th percentile: 434.75
Maximum: 14075
Range: 14075
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 491.89928
Coefficient of variation (CV): 4.4663183
Kurtosis: 150.83135
Mean: 110.13529
Median Absolute Deviation (MAD): 8
Skewness: 10.437494
Sum: 4994195
Variance: 241964.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3240
 
7.1%
2 3127
 
6.9%
0 2846
 
6.3%
3 2780
 
6.1%
4 2477
 
5.5%
5 2096
 
4.6%
6 1747
 
3.9%
7 1568
 
3.5%
8 1359
 
3.0%
9 1194
 
2.6%
Other values (1810) 22912
50.5%
ValueCountFrequency (%)
0 2846
6.3%
1 3240
7.1%
2 3127
6.9%
3 2780
6.1%
4 2477
5.5%
5 2096
4.6%
6 1747
3.9%
7 1568
3.5%
8 1359
3.0%
9 1194
 
2.6%
ValueCountFrequency (%)
14075 1
< 0.1%
12269 1
< 0.1%
12114 1
< 0.1%
12000 1
< 0.1%
11444 1
< 0.1%
11187 1
< 0.1%
10297 1
< 0.1%
10014 1
< 0.1%
9678 1
< 0.1%
9634 1
< 0.1%

status
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 80
Missing (%): 0.2%
Memory size: 354.4 KiB
Released: 44907
Rumored: 229
PostProduction: 97
InProduction: 19
Planned: 13

Length

Max length: 14
Median length: 8
Mean length: 8.0091901
Min length: 7

Characters and Unicode

Total characters: 362544
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Released
2nd row: Released
3rd row: Released
4th row: Released
5th row: Released

Common Values

Value            Count   Frequency (%)
Released         44907   99.0%
Rumored          229     0.5%
PostProduction   97      0.2%
InProduction     19      < 0.1%
Planned          13      < 0.1%
Canceled         1       < 0.1%
(Missing)        80      0.2%
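
These counts explain the 97.0% imbalance alert for status. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class frequencies, normalised by the logarithm of the number of classes; the report does not state the formula, so treat this as an assumption. The helper name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalised Shannon entropy (0 = balanced, 1 = one class only)."""
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1 - entropy / math.log(len(counts))

# Non-missing status counts copied from the table above.
status_counts = {
    "Released": 44907,
    "Rumored": 229,
    "PostProduction": 97,
    "InProduction": 19,
    "Planned": 13,
    "Canceled": 1,
}

score = imbalance_score(status_counts.values())
print(f"{100 * score:.1f}%")
```

Under this assumption the counts reproduce the reported figure, which reflects that almost every row is Released.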

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
released 44907
99.2%
rumored 229
 
0.5%
postproduction 97
 
0.2%
inproduction 19
 
< 0.1%
planned 13
 
< 0.1%
canceled 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 134965
37.2%
d 45266
 
12.5%
R 45136
 
12.4%
s 45004
 
12.4%
l 44921
 
12.4%
a 44921
 
12.4%
o 558
 
0.2%
u 345
 
0.1%
r 345
 
0.1%
m 229
 
0.1%
Other values (7) 854
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 317162
87.5%
Uppercase Letter 45382
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 134965
42.6%
d 45266
 
14.3%
s 45004
 
14.2%
l 44921
 
14.2%
a 44921
 
14.2%
o 558
 
0.2%
u 345
 
0.1%
r 345
 
0.1%
m 229
 
0.1%
t 213
 
0.1%
Other values (3) 395
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
R 45136
99.5%
P 226
 
0.5%
I 19
 
< 0.1%
C 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 362544
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 134965
37.2%
d 45266
 
12.5%
R 45136
 
12.4%
s 45004
 
12.4%
l 44921
 
12.4%
a 44921
 
12.4%
o 558
 
0.2%
u 345
 
0.1%
r 345
 
0.1%
m 229
 
0.1%
Other values (7) 854
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 362544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 134965
37.2%
d 45266
 
12.5%
R 45136
 
12.4%
s 45004
 
12.4%
l 44921
 
12.4%
a 44921
 
12.4%
o 558
 
0.2%
u 345
 
0.1%
r 345
 
0.1%
m 229
 
0.1%
Other values (7) 854
 
0.2%

original_language
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 89
Distinct (%): 0.2%
Missing: 11
Missing (%): < 0.1%
Memory size: 354.4 KiB
en: 32184
fr: 2435
it: 1528
ja: 1346
de: 1077
Other values (84): 6765

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 90670
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): < 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value               Count   Frequency (%)
en                  32184   71.0%
fr                  2435    5.4%
it                  1528    3.4%
ja                  1346    3.0%
de                  1077    2.4%
es                  992     2.2%
ru                  822     1.8%
hi                  508     1.1%
ko                  444     1.0%
zh                  408     0.9%
Other values (79)   3591    7.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
en 32184
71.0%
fr 2435
 
5.4%
it 1528
 
3.4%
ja 1346
 
3.0%
de 1077
 
2.4%
es 992
 
2.2%
ru 822
 
1.8%
hi 508
 
1.1%
ko 444
 
1.0%
zh 408
 
0.9%
Other values (79) 3591
 
7.9%

Most occurring characters

ValueCountFrequency (%)
e 34508
38.1%
n 32892
36.3%
r 3628
 
4.0%
f 2830
 
3.1%
i 2386
 
2.6%
t 2249
 
2.5%
a 1834
 
2.0%
s 1651
 
1.8%
j 1347
 
1.5%
d 1321
 
1.5%
Other values (16) 6024
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 90670
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34508
38.1%
n 32892
36.3%
r 3628
 
4.0%
f 2830
 
3.1%
i 2386
 
2.6%
t 2249
 
2.5%
a 1834
 
2.0%
s 1651
 
1.8%
j 1347
 
1.5%
d 1321
 
1.5%
Other values (16) 6024
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 90670
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34508
38.1%
n 32892
36.3%
r 3628
 
4.0%
f 2830
 
3.1%
i 2386
 
2.6%
t 2249
 
2.5%
a 1834
 
2.0%
s 1651
 
1.8%
j 1347
 
1.5%
d 1321
 
1.5%
Other values (16) 6024
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90670
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 34508
38.1%
n 32892
36.3%
r 3628
 
4.0%
f 2830
 
3.1%
i 2386
 
2.6%
t 2249
 
2.5%
a 1834
 
2.0%
s 1651
 
1.8%
j 1347
 
1.5%
d 1321
 
1.5%
Other values (16) 6024
 
6.6%

runtime
Real number (ℝ)

Distinct 353
Distinct (%) 0.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 93.666895
Minimum 0
Maximum 1256
Zeros 1781
Zeros (%) 3.9%
Negative 0
Negative (%) 0.0%
Memory size 354.4 KiB

Quantile statistics

Minimum 0
5-th percentile 8
Q1 85
median 95
Q3 107
95-th percentile 138
Maximum 1256
Range 1256
Interquartile range (IQR) 22

Descriptive statistics

Standard deviation 38.865238
Coefficient of variation (CV) 0.41493036
Kurtosis 88.775055
Mean 93.666895
Median Absolute Deviation (MAD) 11
Skewness 4.2532768
Sum 4247419
Variance 1510.5067
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90 2548
 
5.6%
0 1781
 
3.9%
100 1470
 
3.2%
95 1409
 
3.1%
93 1212
 
2.7%
96 1104
 
2.4%
92 1078
 
2.4%
94 1061
 
2.3%
91 1055
 
2.3%
88 1030
 
2.3%
Other values (343) 31598
69.7%
ValueCountFrequency (%)
0 1781
3.9%
1 107
 
0.2%
2 33
 
0.1%
3 48
 
0.1%
4 50
 
0.1%
5 51
 
0.1%
6 72
 
0.2%
7 103
 
0.2%
8 78
 
0.2%
9 63
 
0.1%
ValueCountFrequency (%)
1256 1
< 0.1%
1140 2
< 0.1%
931 1
< 0.1%
925 1
< 0.1%
900 1
< 0.1%
877 1
< 0.1%
874 1
< 0.1%
840 2
< 0.1%
780 1
< 0.1%
720 1
< 0.1%
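
The derived figures in the runtime summary above can be cross-checked against one another. The sketch below copies the reported values and recomputes the mean, coefficient of variation, variance, and interquartile range from them. It assumes the conventional definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1); the variable names are illustrative.

```python
import math

# Figures copied from this report.
n_observations = 45346   # Number of observations (dataset overview)
total = 4247419          # Sum
mean = 93.666895         # Mean
std = 38.865238          # Standard deviation
variance = 1510.5067     # Variance
cv = 0.41493036          # Coefficient of variation (CV)
q1, q3 = 85, 107         # Q1 and Q3

derived_mean = total / n_observations
derived_cv = std / mean
derived_variance = std ** 2
derived_iqr = q3 - q1

# Reported values are rounded, so compare with a small relative tolerance.
print(math.isclose(derived_mean, mean, rel_tol=1e-5))
print(math.isclose(derived_cv, cv, rel_tol=1e-5))
print(math.isclose(derived_variance, variance, rel_tol=1e-5))
print(derived_iqr)  # 22
```

The same checks apply to the other numeric variables in the report, provided the count of non-missing observations is used where a column has missing values.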

budget
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 1223
Distinct (%) 2.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 4232579.8
Minimum 0
Maximum 3.8 × 10⁸
Zeros 36470
Zeros (%) 80.4%
Negative 0
Negative (%) 0.0%
Memory size 354.4 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 0
95-th percentile 25000000
Maximum 3.8 × 10⁸
Range 3.8 × 10⁸
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 17443731
Coefficient of variation (CV) 4.1213
Kurtosis 66.618217
Mean 4232579.8
Median Absolute Deviation (MAD) 0
Skewness 7.1180066
Sum 1.9193056 × 10¹¹
Variance 3.0428374 × 10¹⁴
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 36470
80.4%
5000000 286
 
0.6%
10000000 258
 
0.6%
20000000 243
 
0.5%
2000000 242
 
0.5%
15000000 226
 
0.5%
3000000 223
 
0.5%
25000000 206
 
0.5%
1000000 197
 
0.4%
30000000 189
 
0.4%
Other values (1213) 6806
 
15.0%
ValueCountFrequency (%)
0 36470
80.4%
1 25
 
0.1%
2 14
 
< 0.1%
3 9
 
< 0.1%
4 7
 
< 0.1%
5 8
 
< 0.1%
6 5
 
< 0.1%
7 4
 
< 0.1%
8 5
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
380000000 1
 
< 0.1%
300000000 1
 
< 0.1%
280000000 1
 
< 0.1%
270000000 1
 
< 0.1%
260000000 3
 
< 0.1%
258000000 1
 
< 0.1%
255000000 1
 
< 0.1%
250000000 10
< 0.1%
245000000 2
 
< 0.1%
237000000 1
 
< 0.1%

revenue
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 6863
Distinct (%) 15.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 11233655
Minimum 0
Maximum 2.7879651 × 10⁹
Zeros 37949
Zeros (%) 83.7%
Negative 0
Negative (%) 0.0%
Memory size 354.4 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 0
95-th percentile 48025328
Maximum 2.7879651 × 10⁹
Range 2.7879651 × 10⁹
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 64409896
Coefficient of variation (CV) 5.7336544
Kurtosis 236.93621
Mean 11233655
Median Absolute Deviation (MAD) 0
Skewness 12.251264
Sum 5.0940133 × 10¹¹
Variance 4.1486347 × 10¹⁵
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 37949
83.7%
12000000 20
 
< 0.1%
11000000 19
 
< 0.1%
10000000 19
 
< 0.1%
2000000 18
 
< 0.1%
6000000 17
 
< 0.1%
5000000 14
 
< 0.1%
8000000 13
 
< 0.1%
500000 13
 
< 0.1%
1 12
 
< 0.1%
Other values (6853) 7252
 
16.0%
ValueCountFrequency (%)
0 37949
83.7%
1 12
 
< 0.1%
2 3
 
< 0.1%
3 9
 
< 0.1%
4 4
 
< 0.1%
5 5
 
< 0.1%
6 2
 
< 0.1%
7 4
 
< 0.1%
8 5
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
2787965087 1
< 0.1%
2068223624 1
< 0.1%
1845034188 1
< 0.1%
1519557910 1
< 0.1%
1513528810 1
< 0.1%
1506249360 1
< 0.1%
1405403694 1
< 0.1%
1342000000 1
< 0.1%
1274219009 1
< 0.1%
1262886337 1
< 0.1%

tagline
Categorical

HIGH CARDINALITY  MISSING  UNIFORM 

Distinct 20268
Distinct (%) 99.4%
Missing 24960
Missing (%) 55.0%
Memory size 354.4 KiB
Basedonatruestory. 7
Trustnoone. 4
- 4
Becarefulwhatyouwishfor. 4
KnowYourEnemy 3
Other values (20263) 20364

Length

Max length 259
Median length 179
Mean length 39.464093
Min length 1

Characters and Unicode

Total characters 804515
Distinct characters 169
Distinct categories 16
Distinct scripts 6
Distinct blocks 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 20172
Unique (%) 99.0%

Sample

1st row Rollthediceandunleashtheexcitement!
2nd row StillYelling.StillFighting.StillReadyforLove.
3rd row Friendsarethepeoplewholetyoubeyourself...andneverletyouforgetit.
4th row JustWhenHisWorldIsBackToNormal...He'sInForTheSurpriseOfHisLife!
5th row ALosAngelesCrimeSaga

Common Values

ValueCountFrequency (%)
Basedonatruestory. 7
 
< 0.1%
Trustnoone. 4
 
< 0.1%
- 4
 
< 0.1%
Becarefulwhatyouwishfor. 4
 
< 0.1%
KnowYourEnemy 3
 
< 0.1%
ClassicAlbums 3
 
< 0.1%
Documentary 3
 
< 0.1%
Howfarwouldyougo? 3
 
< 0.1%
WhoisJohnGalt? 3
 
< 0.1%
Drama 3
 
< 0.1%
Other values (20258) 20349
44.9%
(Missing) 24960
55.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
basedonatruestory 11
 
0.1%
trustnoone 7
 
< 0.1%
becarefulwhatyouwishfor 7
 
< 0.1%
alovestory 5
 
< 0.1%
atruestory 5
 
< 0.1%
knowyourenemy 4
 
< 0.1%
documentary 4
 
< 0.1%
fightfirewithfire 4
 
< 0.1%
4
 
< 0.1%
twofilms.onelove 3
 
< 0.1%
Other values (20091) 20332
99.7%

Most occurring characters

ValueCountFrequency (%)
e 94342
 
11.7%
t 57223
 
7.1%
o 56534
 
7.0%
a 51450
 
6.4%
n 47460
 
5.9%
i 46013
 
5.7%
r 44957
 
5.6%
s 42345
 
5.3%
h 37144
 
4.6%
l 30159
 
3.7%
Other values (159) 296888
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 680045
84.5%
Uppercase Letter 74965
 
9.3%
Other Punctuation 44556
 
5.5%
Decimal Number 2687
 
0.3%
Dash Punctuation 1942
 
0.2%
Final Punctuation 98
 
< 0.1%
Open Punctuation 56
 
< 0.1%
Close Punctuation 55
 
< 0.1%
Currency Symbol 37
 
< 0.1%
Other Letter 34
 
< 0.1%
Other values (6) 40
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 94342
13.9%
t 57223
 
8.4%
o 56534
 
8.3%
a 51450
 
7.6%
n 47460
 
7.0%
i 46013
 
6.8%
r 44957
 
6.6%
s 42345
 
6.2%
h 37144
 
5.5%
l 30159
 
4.4%
Other values (43) 172418
25.4%
Other Letter
ValueCountFrequency (%)
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (24) 24
70.6%
Uppercase Letter
ValueCountFrequency (%)
T 10007
 
13.3%
A 6871
 
9.2%
S 5648
 
7.5%
H 4401
 
5.9%
I 4387
 
5.9%
E 4304
 
5.7%
W 3678
 
4.9%
O 3476
 
4.6%
L 3193
 
4.3%
N 3193
 
4.3%
Other values (20) 25807
34.4%
Other Punctuation
ValueCountFrequency (%)
. 26640
59.8%
! 5784
 
13.0%
' 5659
 
12.7%
, 4222
 
9.5%
? 1159
 
2.6%
" 582
 
1.3%
148
 
0.3%
: 137
 
0.3%
& 83
 
0.2%
* 42
 
0.1%
Other values (7) 100
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 802
29.8%
1 516
19.2%
2 299
 
11.1%
3 208
 
7.7%
9 208
 
7.7%
5 168
 
6.3%
4 140
 
5.2%
6 121
 
4.5%
7 121
 
4.5%
8 104
 
3.9%
Math Symbol
ValueCountFrequency (%)
+ 5
35.7%
= 5
35.7%
| 2
 
14.3%
~ 1
 
7.1%
1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 1925
99.1%
9
 
0.5%
8
 
0.4%
Final Punctuation
ValueCountFrequency (%)
82
83.7%
15
 
15.3%
» 1
 
1.0%
Initial Punctuation
ValueCountFrequency (%)
14
73.7%
4
 
21.1%
« 1
 
5.3%
Open Punctuation
ValueCountFrequency (%)
( 49
87.5%
[ 7
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 48
87.3%
] 7
 
12.7%
Other Number
ValueCountFrequency (%)
½ 2
66.7%
² 1
33.3%
Modifier Letter
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Currency Symbol
ValueCountFrequency (%)
$ 37
100.0%
Nonspacing Mark
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 755010
93.8%
Common 49470
 
6.1%
Han 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 94342
 
12.5%
t 57223
 
7.6%
o 56534
 
7.5%
a 51450
 
6.8%
n 47460
 
6.3%
i 46013
 
6.1%
r 44957
 
6.0%
s 42345
 
5.6%
h 37144
 
4.9%
l 30159
 
4.0%
Other values (73) 247383
32.8%
Common
ValueCountFrequency (%)
. 26640
53.9%
! 5784
 
11.7%
' 5659
 
11.4%
, 4222
 
8.5%
- 1925
 
3.9%
? 1159
 
2.3%
0 802
 
1.6%
" 582
 
1.2%
1 516
 
1.0%
2 299
 
0.6%
Other values (41) 1882
 
3.8%
Han
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 804086
99.9%
Punctuation 280
 
< 0.1%
None 109
 
< 0.1%
CJK 21
 
< 0.1%
Tamil 5
 
< 0.1%
Hiragana 5
 
< 0.1%
Katakana 4
 
< 0.1%
IPA Ext 2
 
< 0.1%
Modifier Letters 2
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 94342
 
11.7%
t 57223
 
7.1%
o 56534
 
7.0%
a 51450
 
6.4%
n 47460
 
5.9%
i 46013
 
5.7%
r 44957
 
5.6%
s 42345
 
5.3%
h 37144
 
4.6%
l 30159
 
3.8%
Other values (77) 296459
36.9%
Punctuation
ValueCountFrequency (%)
148
52.9%
82
29.3%
15
 
5.4%
14
 
5.0%
9
 
3.2%
8
 
2.9%
4
 
1.4%
None
ValueCountFrequency (%)
é 17
15.6%
ä 16
14.7%
ö 8
 
7.3%
ó 6
 
5.5%
á 6
 
5.5%
ü 5
 
4.6%
ı 5
 
4.6%
í 5
 
4.6%
· 4
 
3.7%
ñ 3
 
2.8%
Other values (26) 34
31.2%
IPA Ext
ValueCountFrequency (%)
ə 2
100.0%
CJK
ValueCountFrequency (%)
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Other values (11) 11
52.4%
Tamil
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Katakana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Modifier Letters
ValueCountFrequency (%)
ˌ 1
50.0%
ˈ 1
50.0%
Hiragana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

id_btc
Real number (ℝ)

Distinct 1078
Distinct (%) 34.1%
Missing 42183
Missing (%) 93.0%
Infinite 0
Infinite (%) 0.0%
Mean 158900.63
Minimum 10
Maximum 479888
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 354.4 KiB

Quantile statistics

Minimum 10
5-th percentile 1617
Q1 44913
median 115822
Q3 253474.5
95-th percentile 425164
Maximum 479888
Range 479878
Interquartile range (IQR) 208561.5

Descriptive statistics

Standard deviation 136342.15
Coefficient of variation (CV) 0.85803401
Kurtosis -0.50880722
Mean 158900.63
Median Absolute Deviation (MAD) 91105
Skewness 0.7806269
Sum 5.0260271 × 10⁸
Variance 1.8589182 × 10¹⁰
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
415931 29
 
0.1%
421566 27
 
0.1%
96887 26
 
0.1%
645 26
 
0.1%
37261 25
 
0.1%
34055 20
 
< 0.1%
374509 16
 
< 0.1%
38451 15
 
< 0.1%
425164 15
 
< 0.1%
19163 14
 
< 0.1%
Other values (1068) 2950
 
6.5%
(Missing) 42183
93.0%
ValueCountFrequency (%)
10 8
< 0.1%
84 4
< 0.1%
119 3
 
< 0.1%
131 3
 
< 0.1%
151 6
< 0.1%
230 3
 
< 0.1%
263 3
 
< 0.1%
264 3
 
< 0.1%
295 5
< 0.1%
328 4
< 0.1%
ValueCountFrequency (%)
479888 2
 
< 0.1%
479549 1
 
< 0.1%
478947 2
 
< 0.1%
478628 12
< 0.1%
478442 1
 
< 0.1%
476066 1
 
< 0.1%
476065 2
 
< 0.1%
476063 2
 
< 0.1%
476056 2
 
< 0.1%
476054 2
 
< 0.1%

name_btc
Categorical

HIGH CARDINALITY  MISSING 

Distinct 1078
Distinct (%) 34.1%
Missing 42183
Missing (%) 93.0%
Memory size 354.4 KiB
TheBoweryBoys 29
TotòCollection 27
Zatôichi:TheBlindSwordsman 26
JamesBondCollection 26
TheCarryOnCollection 25
Other values (1073) 3030

Length

Max length 45
Median length 37
Mean length 21.648435
Min length 7

Characters and Unicode

Total characters 68474
Distinct characters 135
Distinct categories 11
Distinct scripts 6
Distinct blocks 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 170
Unique (%) 5.4%

Sample

1st row ToyStoryCollection
2nd row GrumpyOldMenCollection
3rd row FatheroftheBrideCollection
4th row JamesBondCollection
5th row BaltoCollection

Common Values

ValueCountFrequency (%)
TheBoweryBoys 29
 
0.1%
TotòCollection 27
 
0.1%
Zatôichi:TheBlindSwordsman 26
 
0.1%
JamesBondCollection 26
 
0.1%
TheCarryOnCollection 25
 
0.1%
PokémonCollection 20
 
< 0.1%
Godzilla(Showa)Collection 16
 
< 0.1%
CharlieChan(WarnerOland)Collection 15
 
< 0.1%
DragonBallZ(Movie)Collection 15
 
< 0.1%
TheLandBeforeTimeCollection 14
 
< 0.1%
Other values (1068) 2950
 
6.5%
(Missing) 42183
93.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
theboweryboys 29
 
0.9%
totòcollection 27
 
0.9%
zatôichi:theblindswordsman 26
 
0.8%
jamesbondcollection 26
 
0.8%
thecarryoncollection 25
 
0.8%
pokémoncollection 20
 
0.6%
godzilla(showa)collection 16
 
0.5%
charliechan(warneroland)collection 15
 
0.5%
dragonballz(movie)collection 15
 
0.5%
thelandbeforetimecollection 14
 
0.4%
Other values (1068) 2950
93.3%

Most occurring characters

ValueCountFrequency (%)
o 7970
11.6%
e 7434
10.9%
l 7290
10.6%
n 5396
 
7.9%
i 5388
 
7.9%
t 4726
 
6.9%
c 3467
 
5.1%
C 3181
 
4.6%
a 3088
 
4.5%
r 2715
 
4.0%
Other values (125) 17819
26.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57455
83.9%
Uppercase Letter 9808
 
14.3%
Other Punctuation 332
 
0.5%
Open Punctuation 247
 
0.4%
Close Punctuation 247
 
0.4%
Decimal Number 242
 
0.4%
Dash Punctuation 108
 
0.2%
Other Letter 27
 
< 0.1%
Final Punctuation 3
 
< 0.1%
Modifier Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 7970
13.9%
e 7434
12.9%
l 7290
12.7%
n 5396
9.4%
i 5388
9.4%
t 4726
8.2%
c 3467
6.0%
a 3088
 
5.4%
r 2715
 
4.7%
s 1797
 
3.1%
Other values (53) 8184
14.2%
Uppercase Letter
ValueCountFrequency (%)
C 3181
32.4%
T 1050
 
10.7%
S 746
 
7.6%
B 490
 
5.0%
M 443
 
4.5%
A 393
 
4.0%
D 370
 
3.8%
H 362
 
3.7%
P 305
 
3.1%
G 289
 
2.9%
Other values (25) 2179
22.2%
Decimal Number
ValueCountFrequency (%)
1 61
25.2%
3 44
18.2%
9 44
18.2%
0 39
16.1%
2 18
 
7.4%
8 12
 
5.0%
5 10
 
4.1%
6 6
 
2.5%
7 6
 
2.5%
4 2
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 112
33.7%
: 83
25.0%
, 50
15.1%
& 44
 
13.3%
! 18
 
5.4%
/ 17
 
5.1%
3
 
0.9%
? 3
 
0.9%
* 2
 
0.6%
Other Letter
ValueCountFrequency (%)
3
11.1%
3
11.1%
3
11.1%
3
11.1%
3
11.1%
3
11.1%
3
11.1%
3
11.1%
3
11.1%
Open Punctuation
ValueCountFrequency (%)
( 243
98.4%
[ 4
 
1.6%
Close Punctuation
ValueCountFrequency (%)
) 243
98.4%
] 4
 
1.6%
Dash Punctuation
ValueCountFrequency (%)
- 106
98.1%
2
 
1.9%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Modifier Letter
ValueCountFrequency (%)
3
100.0%
Other Number
ValueCountFrequency (%)
½ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 67187
98.1%
Common 1184
 
1.7%
Cyrillic 76
 
0.1%
Hiragana 15
 
< 0.1%
Katakana 9
 
< 0.1%
Han 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 7970
11.9%
e 7434
11.1%
l 7290
10.9%
n 5396
 
8.0%
i 5388
 
8.0%
t 4726
 
7.0%
c 3467
 
5.2%
C 3181
 
4.7%
a 3088
 
4.6%
r 2715
 
4.0%
Other values (65) 16532
24.6%
Common
ValueCountFrequency (%)
( 243
20.5%
) 243
20.5%
. 112
9.5%
- 106
9.0%
: 83
 
7.0%
1 61
 
5.2%
, 50
 
4.2%
& 44
 
3.7%
3 44
 
3.7%
9 44
 
3.7%
Other values (18) 154
13.0%
Cyrillic
ValueCountFrequency (%)
л 8
 
10.5%
о 8
 
10.5%
и 7
 
9.2%
к 7
 
9.2%
а 6
 
7.9%
р 5
 
6.6%
е 5
 
6.6%
я 4
 
5.3%
ц 3
 
3.9%
К 3
 
3.9%
Other values (13) 20
26.3%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%
Katakana
ValueCountFrequency (%)
3
33.3%
3
33.3%
3
33.3%
Han
ValueCountFrequency (%)
3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68203
99.6%
None 157
 
0.2%
Cyrillic 76
 
0.1%
Hiragana 15
 
< 0.1%
Katakana 12
 
< 0.1%
Punctuation 8
 
< 0.1%
CJK 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 7970
11.7%
e 7434
10.9%
l 7290
10.7%
n 5396
 
7.9%
i 5388
 
7.9%
t 4726
 
6.9%
c 3467
 
5.1%
C 3181
 
4.7%
a 3088
 
4.5%
r 2715
 
4.0%
Other values (65) 17548
25.7%
None
ValueCountFrequency (%)
ô 29
18.5%
é 28
17.8%
ò 27
17.2%
ä 16
10.2%
ı 14
8.9%
ö 11
 
7.0%
í 5
 
3.2%
İ 4
 
2.5%
Ç 2
 
1.3%
ü 2
 
1.3%
Other values (14) 19
12.1%
Cyrillic
ValueCountFrequency (%)
л 8
 
10.5%
о 8
 
10.5%
и 7
 
9.2%
к 7
 
9.2%
а 6
 
7.9%
р 5
 
6.6%
е 5
 
6.6%
я 4
 
5.3%
ц 3
 
3.9%
К 3
 
3.9%
Other values (13) 20
26.3%
Katakana
ValueCountFrequency (%)
3
25.0%
3
25.0%
3
25.0%
3
25.0%
Punctuation
ValueCountFrequency (%)
3
37.5%
3
37.5%
2
25.0%
CJK
ValueCountFrequency (%)
3
100.0%
Hiragana
ValueCountFrequency (%)
3
20.0%
3
20.0%
3
20.0%
3
20.0%
3
20.0%

poster_btc
Categorical

HIGH CARDINALITY  MISSING 

Distinct 1078
Distinct (%) 34.1%
Missing 42183
Missing (%) 93.0%
Memory size 354.4 KiB
/q6sA4bzMT9cK7EEmXYwt7PNrL5h.jpg 29
/4ayJsjC3djGwU9eCWUokdBWvdLC.jpg 27
/8Q31DAtmFJjhFTwQGXghBUCgWK2.jpg 26
/HORpg5CSkmeQlAolx3bKMrKgfi.jpg 26
/2P0HNrYgKDvirV8RCdT1rBSJdbJ.jpg 25
Other values (1073) 3030

Length

Max length 32
Median length 32
Mean length 31.957951
Min length 31

Characters and Unicode

Total characters 101083
Distinct characters 64
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 170
Unique (%) 5.4%

Sample

1st row /7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg
2nd row /nLvUdqgPgm3F85NMCii9gVFUcet.jpg
3rd row /nts4iOmNnq7GNicycMJ9pSAn204.jpg
4th row /HORpg5CSkmeQlAolx3bKMrKgfi.jpg
5th row /w0ZgH6Lgxt2bQYnf1ss74UvYftm.jpg

Common Values

ValueCountFrequency (%)
/q6sA4bzMT9cK7EEmXYwt7PNrL5h.jpg 29
 
0.1%
/4ayJsjC3djGwU9eCWUokdBWvdLC.jpg 27
 
0.1%
/8Q31DAtmFJjhFTwQGXghBUCgWK2.jpg 26
 
0.1%
/HORpg5CSkmeQlAolx3bKMrKgfi.jpg 26
 
0.1%
/2P0HNrYgKDvirV8RCdT1rBSJdbJ.jpg 25
 
0.1%
/j5te0YNZAMXDBnsqTUDKIBEt8iu.jpg 20
 
< 0.1%
/scvwS6k8gIW8w24UcmePQqVL10l.jpg 16
 
< 0.1%
/eSDdu6pbocmayu1SXQFU9VYYoQ6.jpg 15
 
< 0.1%
/2VMZ1zRFPnUQtQp5K4WRXvDYBjh.jpg 15
 
< 0.1%
/n1bjdBVThBezxR6nEf2dy43sTtV.jpg 14
 
< 0.1%
Other values (1068) 2950
 
6.5%
(Missing) 42183
93.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
q6sa4bzmt9ck7eemxywt7pnrl5h.jpg 29
 
0.9%
4ayjsjc3djgwu9ecwuokdbwvdlc.jpg 27
 
0.9%
8q31datmfjjhftwqgxghbucgwk2.jpg 26
 
0.8%
horpg5cskmeqlaolx3bkmrkgfi.jpg 26
 
0.8%
2p0hnrygkdvirv8rcdt1rbsjdbj.jpg 25
 
0.8%
j5te0ynzamxdbnsqtudkibet8iu.jpg 20
 
0.6%
scvws6k8giw8w24ucmepqqvl10l.jpg 16
 
0.5%
esddu6pbocmayu1sxqfu9vyyoq6.jpg 15
 
0.5%
2vmz1zrfpnuqtqp5k4wrxvdybjh.jpg 15
 
0.5%
n1bjdbvthbezxr6nef2dy43sttv.jpg 14
 
0.4%
Other values (1068) 2950
93.3%

Most occurring characters

ValueCountFrequency (%)
g 4776
 
4.7%
p 4566
 
4.5%
j 4484
 
4.4%
/ 3163
 
3.1%
. 3163
 
3.1%
m 1591
 
1.6%
d 1556
 
1.5%
C 1530
 
1.5%
5 1522
 
1.5%
k 1512
 
1.5%
Other values (54) 73220
72.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46019
45.5%
Uppercase Letter 34963
34.6%
Decimal Number 13775
 
13.6%
Other Punctuation 6326
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 4776
 
10.4%
p 4566
 
9.9%
j 4484
 
9.7%
m 1591
 
3.5%
d 1556
 
3.4%
k 1512
 
3.3%
c 1483
 
3.2%
i 1474
 
3.2%
f 1466
 
3.2%
l 1460
 
3.2%
Other values (16) 21651
47.0%
Uppercase Letter
ValueCountFrequency (%)
C 1530
 
4.4%
U 1488
 
4.3%
Q 1462
 
4.2%
F 1425
 
4.1%
D 1413
 
4.0%
K 1412
 
4.0%
Y 1404
 
4.0%
S 1403
 
4.0%
J 1389
 
4.0%
X 1386
 
4.0%
Other values (16) 20651
59.1%
Decimal Number
ValueCountFrequency (%)
5 1522
11.0%
2 1429
10.4%
1 1404
10.2%
4 1394
10.1%
9 1383
10.0%
3 1370
9.9%
7 1360
9.9%
6 1340
9.7%
8 1339
9.7%
0 1234
9.0%
Other Punctuation
ValueCountFrequency (%)
/ 3163
50.0%
. 3163
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 80982
80.1%
Common 20101
 
19.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 4776
 
5.9%
p 4566
 
5.6%
j 4484
 
5.5%
m 1591
 
2.0%
d 1556
 
1.9%
C 1530
 
1.9%
k 1512
 
1.9%
U 1488
 
1.8%
c 1483
 
1.8%
i 1474
 
1.8%
Other values (42) 56522
69.8%
Common
ValueCountFrequency (%)
/ 3163
15.7%
. 3163
15.7%
5 1522
7.6%
2 1429
7.1%
1 1404
7.0%
4 1394
6.9%
9 1383
6.9%
3 1370
6.8%
7 1360
6.8%
6 1340
6.7%
Other values (2) 2573
12.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101083
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 4776
 
4.7%
p 4566
 
4.5%
j 4484
 
4.4%
/ 3163
 
3.1%
. 3163
 
3.1%
m 1591
 
1.6%
d 1556
 
1.5%
C 1530
 
1.5%
5 1522
 
1.5%
k 1512
 
1.5%
Other values (54) 73220
72.4%

backdrop_btc
Categorical

HIGH CARDINALITY  MISSING 

Distinct 1077
Distinct (%) 34.0%
Missing 42183
Missing (%) 93.0%
Memory size 354.4 KiB
/foe3kuiJmg5AklhtD3skWbaTMf2.jpg 29
/jaUuprubvAxXLAY5hUfrNjxccUh.jpg 27
/bY8gLImMR5Pr9PaG3ZpobfaAQ8N.jpg 26
/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg 26
/38tF1LJN7ULeZAuAfP7beaPMfcl.jpg 25
Other values (1072) 3030

Length

Max length 32
Median length 32
Mean length 31.976288
Min length 31

Characters and Unicode

Total characters 101141
Distinct characters 64
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 170
Unique (%) 5.4%

Sample

1st row /9FBwqcd9IRruEDUrTdcaafOMKUq.jpg
2nd row /hypTnLot2z8wpFS7qwsQHW1uV8u.jpg
3rd row /7qwE57OVZmMJChBpLEbJEmzUydk.jpg
4th row /6VcVl48kNKvdXOZfJPdarlUGOsk.jpg
5th row /9VM5LiJV0bGb1st1KyHA3cVnO2G.jpg

Common Values

ValueCountFrequency (%)
/foe3kuiJmg5AklhtD3skWbaTMf2.jpg 29
 
0.1%
/jaUuprubvAxXLAY5hUfrNjxccUh.jpg 27
 
0.1%
/bY8gLImMR5Pr9PaG3ZpobfaAQ8N.jpg 26
 
0.1%
/6VcVl48kNKvdXOZfJPdarlUGOsk.jpg 26
 
0.1%
/38tF1LJN7ULeZAuAfP7beaPMfcl.jpg 25
 
0.1%
/iGoYKA0TFfgSoZpG2u5viTJMGfK.jpg 20
 
< 0.1%
/dx9YSup5zEOjxYwG4UkYBVAZIXo.jpg 16
 
< 0.1%
/9bE62qBanBFtoiIc9cXjk1xW3w.jpg 15
 
< 0.1%
/7PcbijxTfwi9vjWEfXdS0ReAw8q.jpg 15
 
< 0.1%
/alkvR9vTtuZEmd5ygsayOfxYOMa.jpg 14
 
< 0.1%
Other values (1067) 2950
 
6.5%
(Missing) 42183
93.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
foe3kuijmg5aklhtd3skwbatmf2.jpg 29
 
0.9%
jauuprubvaxxlay5hufrnjxccuh.jpg 27
 
0.9%
by8glimmr5pr9pag3zpobfaaq8n.jpg 26
 
0.8%
6vcvl48knkvdxozfjpdarlugosk.jpg 26
 
0.8%
38tf1ljn7ulezauafp7beapmfcl.jpg 25
 
0.8%
igoyka0tffgsozpg2u5vitjmgfk.jpg 20
 
0.6%
dx9ysup5zeojxywg4ukybvazixo.jpg 16
 
0.5%
9be62qbanbftoiic9cxjk1xw3w.jpg 15
 
0.5%
7pcbijxtfwi9vjwefxds0reaw8q.jpg 15
 
0.5%
alkvr9vttuzemd5ygsayofxyoma.jpg 14
 
0.4%
Other values (1067) 2950
93.3%

Most occurring characters

ValueCountFrequency (%)
p 4574
 
4.5%
j 4560
 
4.5%
g 4546
 
4.5%
/ 3163
 
3.1%
. 3163
 
3.1%
k 1685
 
1.7%
c 1664
 
1.6%
f 1616
 
1.6%
u 1541
 
1.5%
8 1530
 
1.5%
Other values (54) 73099
72.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45998
45.5%
Uppercase Letter 34678
34.3%
Decimal Number 14139
 
14.0%
Other Punctuation 6326
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 4574
 
9.9%
j 4560
 
9.9%
g 4546
 
9.9%
k 1685
 
3.7%
c 1664
 
3.6%
f 1616
 
3.5%
u 1541
 
3.4%
a 1512
 
3.3%
i 1470
 
3.2%
b 1470
 
3.2%
Other values (16) 21360
46.4%
Uppercase Letter
ValueCountFrequency (%)
A 1485
 
4.3%
Z 1482
 
4.3%
T 1450
 
4.2%
U 1427
 
4.1%
Y 1409
 
4.1%
N 1408
 
4.1%
K 1405
 
4.1%
M 1396
 
4.0%
L 1383
 
4.0%
G 1382
 
4.0%
Other values (16) 20451
59.0%
Decimal Number
ValueCountFrequency (%)
8 1530
10.8%
9 1492
10.6%
5 1452
10.3%
7 1432
10.1%
2 1426
10.1%
1 1421
10.1%
3 1405
9.9%
0 1399
9.9%
6 1330
9.4%
4 1252
8.9%
Other Punctuation
ValueCountFrequency (%)
/ 3163
50.0%
. 3163
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 80676
79.8%
Common 20465
 
20.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 4574
 
5.7%
j 4560
 
5.7%
g 4546
 
5.6%
k 1685
 
2.1%
c 1664
 
2.1%
f 1616
 
2.0%
u 1541
 
1.9%
a 1512
 
1.9%
A 1485
 
1.8%
Z 1482
 
1.8%
Other values (42) 56011
69.4%
Common
ValueCountFrequency (%)
/ 3163
15.5%
. 3163
15.5%
8 1530
7.5%
9 1492
7.3%
5 1452
7.1%
7 1432
7.0%
2 1426
7.0%
1 1421
6.9%
3 1405
6.9%
0 1399
6.8%
Other values (2) 2582
12.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101141
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 4574
 
4.5%
j 4560
 
4.5%
g 4546
 
4.5%
/ 3163
 
3.1%
. 3163
 
3.1%
k 1685
 
1.7%
c 1664
 
1.6%
f 1616
 
1.6%
u 1541
 
1.5%
8 1530
 
1.5%
Other values (54) 73099
72.3%

iso_639_1
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct 1916
Distinct (%) 4.6%
Missing 3792
Missing (%) 8.4%
Memory size 354.4 KiB
en 22366
fr 1850
ja 1287
it 1217
es 901
Other values (1911) 13933

Length

Max length 38
Median length 2
Mean length 2.8379699
Min length 2

Characters and Unicode

Total characters 117929
Distinct characters 27
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1358
Unique (%) 3.3%

Sample

1st row en
2nd row en,fr
3rd row en
4th row en
5th row en

Common Values

ValueCountFrequency (%)
en 22366
49.3%
fr 1850
 
4.1%
ja 1287
 
2.8%
it 1217
 
2.7%
es 901
 
2.0%
ru 807
 
1.8%
de 760
 
1.7%
en,fr 681
 
1.5%
en,es 572
 
1.3%
hi 480
 
1.1%
Other values (1906) 10633
23.4%
(Missing) 3792
 
8.4%
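
Values such as `en,fr` and `en,es` in the table above suggest that this column stores comma-joined lists of codes, which would help explain the high distinct count. As a minimal sketch under that assumption, the snippet below splits each value on commas and counts the individual codes using only the standard library. The function name `count_codes` and the sample rows are hypothetical.

```python
from collections import Counter


def count_codes(values):
    """Count individual codes in comma-joined strings, skipping missing entries."""
    counts = Counter()
    for value in values:
        if value is None:  # treat None as a missing cell
            continue
        for code in value.split(","):
            code = code.strip()
            if code:
                counts[code] += 1
    return counts


# Hypothetical rows shaped like the sample values shown in this section.
rows = ["en", "en,fr", "en", "en,es", None, "fr"]
code_counts = count_codes(rows)
print(code_counts.most_common(3))  # [('en', 4), ('fr', 2), ('es', 1)]
```

Counting per code rather than per combination gives a cardinality closer to the number of distinct languages than to the number of distinct combinations.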

Length

Histogram of lengths of the category
ValueCountFrequency (%)
en 22366
53.8%
fr 1850
 
4.5%
ja 1287
 
3.1%
it 1217
 
2.9%
es 901
 
2.2%
ru 807
 
1.9%
de 760
 
1.8%
en,fr 681
 
1.6%
en,es 572
 
1.4%
hi 480
 
1.2%
Other values (1906) 10633
25.6%

Most occurring characters

ValueCountFrequency (%)
e 34323
29.1%
n 29775
25.2%
, 11607
 
9.8%
r 6717
 
5.7%
f 4729
 
4.0%
i 3684
 
3.1%
t 3680
 
3.1%
s 3621
 
3.1%
d 2983
 
2.5%
a 2944
 
2.5%
Other values (17) 13866
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 106322
90.2%
Other Punctuation 11607
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34323
32.3%
n 29775
28.0%
r 6717
 
6.3%
f 4729
 
4.4%
i 3684
 
3.5%
t 3680
 
3.5%
s 3621
 
3.4%
d 2983
 
2.8%
a 2944
 
2.8%
h 2351
 
2.2%
Other values (16) 11515
 
10.8%
Other Punctuation
ValueCountFrequency (%)
, 11607
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 106322
90.2%
Common 11607
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34323
32.3%
n 29775
28.0%
r 6717
 
6.3%
f 4729
 
4.4%
i 3684
 
3.5%
t 3680
 
3.5%
s 3621
 
3.4%
d 2983
 
2.8%
a 2944
 
2.8%
h 2351
 
2.2%
Other values (16) 11515
 
10.8%
Common
ValueCountFrequency (%)
, 11607
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 117929
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 34323
29.1%
n 29775
25.2%
, 11607
 
9.8%
r 6717
 
5.7%
f 4729
 
4.0%
i 3684
 
3.1%
t 3680
 
3.1%
s 3621
 
3.1%
d 2983
 
2.5%
a 2944
 
2.5%
Other values (17) 13866
11.8%

language_name
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct 1827
Distinct (%) 4.4%
Missing 3915
Missing (%) 8.6%
Memory size 354.4 KiB
English 22366
Français 1850
日本語 1287
Italiano 1217
Español 901
Other values (1822) 13810

Length

Max length 101
Median length 7
Mean length 9.0720234
Min length 1

Characters and Unicode

Total characters 375863
Distinct characters 169
Distinct categories 6
Distinct scripts 15
Distinct blocks 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1285
Unique (%) 3.1%

Sample

1st row English
2nd row English,Français
3rd row English
4th row English
5th row English

Common Values

ValueCountFrequency (%)
English 22366
49.3%
Français 1850
 
4.1%
日本語 1287
 
2.8%
Italiano 1217
 
2.7%
Español 901
 
2.0%
Pусский 807
 
1.8%
Deutsch 760
 
1.7%
English,Français 681
 
1.5%
English,Español 572
 
1.3%
हिन्दी 480
 
1.1%
Other values (1817) 10510
23.2%
(Missing) 3915
 
8.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
english 22447
54.2%
français 1857
 
4.5%
日本語 1288
 
3.1%
italiano 1219
 
2.9%
español 912
 
2.2%
pусский 813
 
2.0%
deutsch 764
 
1.8%
english,français 689
 
1.7%
english,español 576
 
1.4%
हिन्दी 488
 
1.2%
Other values (1705) 10378
25.0%

Most occurring characters

ValueCountFrequency (%)
s 42209
11.2%
n 37415
 
10.0%
i 36983
 
9.8%
l 34590
 
9.2%
h 31428
 
8.4%
E 31167
 
8.3%
g 30383
 
8.1%
a 18889
 
5.0%
, 11607
 
3.1%
o 7038
 
1.9%
Other values (159) 94154
25.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 291316
77.5%
Uppercase Letter 46331
 
12.3%
Other Letter 22160
 
5.9%
Other Punctuation 12672
 
3.4%
Spacing Mark 1836
 
0.5%
Nonspacing Mark 1548
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 42209
14.5%
n 37415
12.8%
i 36983
12.7%
l 34590
11.9%
h 31428
10.8%
g 30383
10.4%
a 18889
6.5%
o 7038
 
2.4%
r 6115
 
2.1%
t 5943
 
2.0%
Other values (63) 40323
13.8%
Other Letter
ValueCountFrequency (%)
1755
 
7.9%
1755
 
7.9%
1755
 
7.9%
1263
 
5.7%
946
 
4.3%
790
 
3.6%
790
 
3.6%
706
 
3.2%
706
 
3.2%
706
 
3.2%
Other values (46) 10988
49.6%
Uppercase Letter
ValueCountFrequency (%)
E 31167
67.3%
F 4189
 
9.0%
D 2921
 
6.3%
P 2662
 
5.7%
I 2364
 
5.1%
N 826
 
1.8%
L 479
 
1.0%
M 360
 
0.8%
T 307
 
0.7%
Č 281
 
0.6%
Other values (13) 775
 
1.7%
Spacing Mark
ValueCountFrequency (%)
ि 706
38.5%
706
38.5%
136
 
7.4%
ி 111
 
6.0%
94
 
5.1%
47
 
2.6%
18
 
1.0%
18
 
1.0%
Nonspacing Mark
ValueCountFrequency (%)
706
45.6%
ִ 430
27.8%
ְ 215
 
13.9%
111
 
7.2%
68
 
4.4%
18
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 11607
91.6%
/ 1015
 
8.0%
? 50
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 325339
86.6%
Common 12672
 
3.4%
Han 10473
 
2.8%
Cyrillic 10381
 
2.8%
Devanagari 4236
 
1.1%
Arabic 3332
 
0.9%
Hangul 3252
 
0.9%
Hebrew 1720
 
0.5%
Greek 1696
 
0.5%
Thai 1225
 
0.3%
Other values (5) 1537
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 42209
13.0%
n 37415
11.5%
i 36983
11.4%
l 34590
10.6%
h 31428
9.7%
E 31167
9.6%
g 30383
9.3%
a 18889
 
5.8%
o 7038
 
2.2%
r 6115
 
1.9%
Other values (50) 49122
15.1%
Cyrillic
ValueCountFrequency (%)
с 3190
30.7%
к 1722
16.6%
и 1667
16.1%
й 1605
15.5%
у 1554
15.0%
а 112
 
1.1%
р 86
 
0.8%
ь 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 286
 
2.8%
Arabic
ValueCountFrequency (%)
ر 535
16.1%
ا 535
16.1%
ة 340
10.2%
ي 340
10.2%
ب 340
10.2%
ع 340
10.2%
ل 340
10.2%
س 140
 
4.2%
ف 140
 
4.2%
ی 140
 
4.2%
Other values (5) 142
 
4.3%
Han
ValueCountFrequency (%)
1755
16.8%
1755
16.8%
1755
16.8%
1263
12.1%
946
9.0%
790
7.5%
790
7.5%
广 473
 
4.5%
473
 
4.5%
473
 
4.5%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ְ 215
12.5%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ב 215
12.5%
ע 215
12.5%
Greek
ValueCountFrequency (%)
λ 424
25.0%
ά 212
12.5%
κ 212
12.5%
ν 212
12.5%
ι 212
12.5%
η 212
12.5%
ε 212
12.5%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Devanagari
ValueCountFrequency (%)
ि 706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Thai
ValueCountFrequency (%)
350
28.6%
175
14.3%
175
14.3%
175
14.3%
175
14.3%
175
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
111
20.0%
111
20.0%
ி 111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%
Common
ValueCountFrequency (%)
, 11607
91.6%
/ 1015
 
8.0%
? 50
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 329202
87.6%
CJK 10473
 
2.8%
Cyrillic 10381
 
2.8%
None 10379
 
2.8%
Devanagari 4236
 
1.1%
Arabic 3332
 
0.9%
Hangul 3252
 
0.9%
Hebrew 1720
 
0.5%
Thai 1225
 
0.3%
Tamil 555
 
0.1%
Other values (6) 1108
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 42209
12.8%
n 37415
11.4%
i 36983
11.2%
l 34590
10.5%
h 31428
9.5%
E 31167
9.5%
g 30383
9.2%
a 18889
 
5.7%
, 11607
 
3.5%
o 7038
 
2.1%
Other values (37) 47493
14.4%
None
ValueCountFrequency (%)
ç 4433
42.7%
ñ 2410
23.2%
ê 590
 
5.7%
λ 424
 
4.1%
Č 281
 
2.7%
ý 281
 
2.7%
ü 246
 
2.4%
ά 212
 
2.0%
κ 212
 
2.0%
ν 212
 
2.0%
Other values (10) 1078
 
10.4%
Cyrillic
ValueCountFrequency (%)
с 3190
30.7%
к 1722
16.6%
и 1667
16.1%
й 1605
15.5%
у 1554
15.0%
а 112
 
1.1%
р 86
 
0.8%
ь 53
 
0.5%
ї 53
 
0.5%
н 53
 
0.5%
Other values (12) 286
 
2.8%
CJK
ValueCountFrequency (%)
1755
16.8%
1755
16.8%
1755
16.8%
1263
12.1%
946
9.0%
790
7.5%
790
7.5%
广 473
 
4.5%
473
 
4.5%
473
 
4.5%
Devanagari
ValueCountFrequency (%)
ि 706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
706
16.7%
Hangul
ValueCountFrequency (%)
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
542
16.7%
Arabic
ValueCountFrequency (%)
ر 535
16.1%
ا 535
16.1%
ة 340
10.2%
ي 340
10.2%
ب 340
10.2%
ع 340
10.2%
ل 340
10.2%
س 140
 
4.2%
ف 140
 
4.2%
ی 140
 
4.2%
Other values (5) 142
 
4.3%
Hebrew
ValueCountFrequency (%)
ִ 430
25.0%
ְ 215
12.5%
ת 215
12.5%
י 215
12.5%
ר 215
12.5%
ב 215
12.5%
ע 215
12.5%
Thai
ValueCountFrequency (%)
350
28.6%
175
14.3%
175
14.3%
175
14.3%
175
14.3%
175
14.3%
Telugu
ValueCountFrequency (%)
136
33.3%
68
16.7%
68
16.7%
68
16.7%
68
16.7%
Tamil
ValueCountFrequency (%)
111
20.0%
111
20.0%
111
20.0%
ி 111
20.0%
111
20.0%
Bengali
ValueCountFrequency (%)
94
40.0%
47
20.0%
47
20.0%
47
20.0%
Latin Ext Additional
ValueCountFrequency (%)
ế 61
50.0%
61
50.0%
Georgian
ValueCountFrequency (%)
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
33
14.3%
Gurmukhi
ValueCountFrequency (%)
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
18
16.7%
IPA Ext
ValueCountFrequency (%)
ə 4
100.0%

release_year
Real number (ℝ)

Distinct: 135
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1991.8828
Minimum: 1874
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 1874
5-th percentile: 1941
Q1: 1978
Median: 2001
Q3: 2010
95-th percentile: 2015
Maximum: 2020
Range: 146
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 24.05304
Coefficient of variation (CV): 0.01207553
Kurtosis: 0.84037057
Mean: 1991.8828
Median Absolute Deviation (MAD): 12
Skewness: -1.2247867
Sum: 90323919
Variance: 578.54874
Monotonicity: Not monotonic
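
Several of the release_year figures are derived from one another, so they can be cross-checked by hand. The sketch below is not part of the profiling tool; it only recomputes the mean, coefficient of variation, variance, range and IQR from the other reported numbers. The variable names are illustrative, and the row count of 45346 is taken from the report overview.

```python
# Figures copied from the release_year summary in this report.
n = 45346                      # number of observations (report overview)
total = 90323919               # Sum
std = 24.05304                 # Standard deviation
minimum, maximum = 1874, 2020  # Minimum, Maximum
q1, q3 = 1978, 2010            # Q1, Q3

mean = total / n               # arithmetic mean
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.4f} cv={cv:.8f} variance={variance:.5f}")
print(f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the reported Mean, CV, Variance, Range and IQR to the precision shown in the report.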
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
2014                1973    4.4%
2015                1904    4.2%
2013                1887    4.2%
2012                1721    3.8%
2011                1666    3.7%
2016                1604    3.5%
2009                1585    3.5%
2010                1501    3.3%
2008                1470    3.2%
2007                1319    2.9%
Other values (125)  28716   63.3%

Minimum 10 values

Value   Count   Frequency (%)
1874    1       < 0.1%
1878    1       < 0.1%
1883    1       < 0.1%
1887    1       < 0.1%
1888    2       < 0.1%
1890    5       < 0.1%
1891    6       < 0.1%
1892    3       < 0.1%
1893    1       < 0.1%
1894    13      < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
2020    1       < 0.1%
2018    5       < 0.1%
2017    532     1.2%
2016    1604    3.5%
2015    1904    4.2%
2014    1973    4.4%
2013    1887    4.2%
2012    1721    3.8%
2011    1666    3.7%
2010    1501    3.3%

return
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 1256
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 660.47917
Minimum: 0
Maximum: 12396383
Zeros: 40033
Zeros (%): 88.3%
Negative: 0
Negative (%): 0.0%
Memory size: 354.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2.5375
Maximum: 12396383
Range: 12396383
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 74717.996
Coefficient of variation (CV): 113.12695
Kurtosis: 20659.288
Mean: 660.47917
Median Absolute Deviation (MAD): 0
Skewness: 138.28379
Sum: 29950088
Variance: 5.582779 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
0                    40033   88.3%
0.01                 64      0.1%
0.02                 38      0.1%
1                    34      0.1%
0.08                 29      0.1%
0.06                 27      0.1%
0.62                 25      0.1%
0.03                 24      0.1%
1.1                  23      0.1%
1.2                  23      0.1%
Other values (1246)  5026    11.1%

Minimum 10 values

Value   Count   Frequency (%)
0       40033   88.3%
0.01    64      0.1%
0.02    38      0.1%
0.03    24      0.1%
0.04    19      < 0.1%
0.05    22      < 0.1%
0.06    27      0.1%
0.07    18      < 0.1%
0.08    29      0.1%
0.09    16      < 0.1%

Maximum 10 values

Value        Count   Frequency (%)
12396383     1       < 0.1%
8500000      1       < 0.1%
4197476.62   1       < 0.1%
2755584      1       < 0.1%
1018619.28   1       < 0.1%
1000000      1       < 0.1%
26881.72     1       < 0.1%
12890.39     1       < 0.1%
5330.34      1       < 0.1%
4133.33      1       < 0.1%
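
The SKEWED alert on return can be made concrete with the figures already in this section. The sketch below is an illustration, not output of the profiling tool. It measures how much of the reported Sum is held by the six largest values and what the mean would be without them. The variable names are illustrative, and because the report rounds its figures the results are approximate.

```python
# Figures copied from the return summary: Sum, row count, and the six
# largest values listed among the maximum values.
total_sum = 29950088
n_rows = 45346
top_six = [12396383, 8500000, 4197476.62, 2755584, 1018619.28, 1000000]

# Fraction of the column total contributed by just these six rows.
top_share = sum(top_six) / total_sum

# Mean over all rows, and mean after dropping the six largest rows.
mean_all = total_sum / n_rows
mean_without_top = (total_sum - sum(top_six)) / (n_rows - len(top_six))

print(f"share of sum held by the six largest values: {top_share:.1%}")
print(f"mean over all rows: {mean_all:.2f}")
print(f"mean excluding the six largest values: {mean_without_top:.2f}")
```

Six rows out of 45346 account for nearly the whole sum, which is why the mean (about 660) is far from the median (0) and the 95-th percentile (2.5375). For a column like this, quantiles are a more informative summary than the mean.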

companies_id
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 22290
Distinct (%): 67.4%
Missing: 12264
Missing (%): 27.0%
Memory size: 354.4 KiB

Value                 Count
8411                  742
6194                  540
4                     504
306                   439
33                    320
Other values (22285)  30537

Length

Max length: 146
Median length: 124
Mean length: 9.5989964
Min length: 1

Characters and Unicode

Total characters: 317554
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
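
As a rough illustration of what "character properties" means here, the sketch below counts the Unicode general category of each character in a string using Python's standard `unicodedata` module. It is not the profiling tool's own code. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced for this example, and the mapping covers only the handful of categories needed for it. The standard library exposes general categories but not scripts or blocks, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the
# report. Only a few categories are listed; others fall back to the code.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(text):
    """Count the characters of text by Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}

# A companies_id-style value: digits and commas only.
print(category_counts("559,2550,10201"))
# A release_date-style value: digits and hyphens only.
print(category_counts("1995-10-30"))
```

Digits plus commas give the two categories reported for companies_id (Decimal Number and Other Punctuation), and digits plus hyphens give the two reported for release_date (Decimal Number and Dash Punctuation).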

Unique

Unique: 19970
Unique (%): 60.4%

Sample

1st row: 3
2nd row: 559,2550,10201
3rd row: 6194,19464
4th row: 306
5th row: 5842,9195
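
The sample rows show that a single companies_id cell can hold several comma-separated ids. Every distinct combination then counts as its own category, which likely contributes to the HIGH CARDINALITY alert. The sketch below is a suggestion rather than anything the report itself does: it splits such cells into individual ids before counting. The `split_ids` helper is a hypothetical name.

```python
from collections import Counter

# The five sample rows shown in this section; a cell may hold several ids.
sample_rows = ["3", "559,2550,10201", "6194,19464", "306", "5842,9195"]

def split_ids(cell):
    """Split a comma-separated cell into a list of id strings."""
    return [part for part in cell.split(",") if part]

# Number of ids per row, and a frequency count over individual ids.
ids_per_row = [len(split_ids(row)) for row in sample_rows]
id_counts = Counter(i for row in sample_rows for i in split_ids(row))

print(ids_per_row)
print(len(id_counts), "distinct ids in", len(sample_rows), "rows")
```

Counting individual ids rather than whole cells gives per-company frequencies, which are usually closer to what an analysis of production companies needs.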

Common Values

Value                 Count   Frequency (%)
8411                  742     1.6%
6194                  540     1.2%
4                     504     1.1%
306                   439     1.0%
33                    320     0.7%
6                     247     0.5%
441                   207     0.5%
5                     146     0.3%
5120                  145     0.3%
2                     85      0.2%
Other values (22280)  29707   65.5%
(Missing)             12264   27.0%

Length

Histogram of lengths of the category

Value                 Count   Frequency (%)
8411                  742     2.2%
6194                  540     1.6%
4                     504     1.5%
306                   439     1.3%
33                    320     1.0%
6                     247     0.7%
441                   207     0.6%
5                     146     0.4%
5120                  145     0.4%
2                     85      0.3%
Other values (22280)  29707   89.8%

Most occurring characters

ValueCountFrequency (%)
1 42863
13.5%
, 35364
11.1%
2 31502
9.9%
3 30318
9.5%
4 29421
9.3%
6 27019
8.5%
5 26750
8.4%
8 24769
7.8%
7 23568
7.4%
9 23406
7.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 282190
88.9%
Other Punctuation 35364
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 42863
15.2%
2 31502
11.2%
3 30318
10.7%
4 29421
10.4%
6 27019
9.6%
5 26750
9.5%
8 24769
8.8%
7 23568
8.4%
9 23406
8.3%
0 22574
8.0%
Other Punctuation
ValueCountFrequency (%)
, 35364
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 317554
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 42863
13.5%
, 35364
11.1%
2 31502
9.9%
3 30318
9.5%
4 29421
9.3%
6 27019
8.5%
5 26750
8.4%
8 24769
7.8%
7 23568
7.4%
9 23406
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 317554
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 42863
13.5%
, 35364
11.1%
2 31502
9.9%
3 30318
9.5%
4 29421
9.3%
6 27019
8.5%
5 26750
8.4%
8 24769
7.8%
7 23568
7.4%
9 23406
7.4%

companies_name
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 22240
Distinct (%): 67.2%
Missing: 12264
Missing (%): 27.0%
Memory size: 354.4 KiB
Metro-Goldwyn-Mayer(MGM)
 
742
WarnerBros.
 
540
ParamountPictures
 
504
TwentiethCenturyFoxFilmCorporation
 
439
UniversalPictures
 
320
Other values (22235)
30537 

Length

Max length531
Median length313
Mean length36.536394
Min length2

Characters and Unicode

Total characters1208697
Distinct characters288
Distinct categories15 ?
Distinct scripts6 ?
Distinct blocks6 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique19909 ?
Unique (%)60.2%

Sample

1st rowPixarAnimationStudios
2nd rowTriStarPictures,TeitlerFilm,InterscopeCommunications
3rd rowWarnerBros.,LancasterGate
4th rowTwentiethCenturyFoxFilmCorporation
5th rowSandollarProductions,TouchstonePictures

Common Values

Value                               Count   Frequency (%)
Metro-Goldwyn-Mayer(MGM)            742     1.6%
WarnerBros.                         540     1.2%
ParamountPictures                   504     1.1%
TwentiethCenturyFoxFilmCorporation  439     1.0%
UniversalPictures                   320     0.7%
RKORadioPictures                    247     0.5%
ColumbiaPicturesCorporation         207     0.5%
ColumbiaPictures                    146     0.3%
Mosfilm                             145     0.3%
WaltDisneyPictures                  85      0.2%
Other values (22230)                29707   65.5%
(Missing)                           12264   27.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
metro-goldwyn-mayer(mgm 742
 
2.2%
warnerbros 540
 
1.6%
paramountpictures 504
 
1.5%
twentiethcenturyfoxfilmcorporation 439
 
1.3%
universalpictures 320
 
1.0%
rkoradiopictures 247
 
0.7%
columbiapicturescorporation 207
 
0.6%
columbiapictures 146
 
0.4%
mosfilm 145
 
0.4%
waltdisneypictures 85
 
0.3%
Other values (22189) 29707
89.8%

Most occurring characters

ValueCountFrequency (%)
i 103753
 
8.6%
e 91275
 
7.6%
n 87195
 
7.2%
o 82733
 
6.8%
r 81320
 
6.7%
t 81276
 
6.7%
a 74646
 
6.2%
s 60744
 
5.0%
l 49324
 
4.1%
m 42956
 
3.6%
Other values (278) 453475
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 956127
79.1%
Uppercase Letter 192473
 
15.9%
Other Punctuation 42741
 
3.5%
Decimal Number 4154
 
0.3%
Dash Punctuation 4149
 
0.3%
Open Punctuation 4140
 
0.3%
Close Punctuation 4139
 
0.3%
Math Symbol 594
 
< 0.1%
Other Letter 140
 
< 0.1%
Other Symbol 25
 
< 0.1%
Other values (5) 15
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 103753
10.9%
e 91275
9.5%
n 87195
9.1%
o 82733
8.7%
r 81320
8.5%
t 81276
8.5%
a 74646
 
7.8%
s 60744
 
6.4%
l 49324
 
5.2%
m 42956
 
4.5%
Other values (102) 200905
21.0%
Other Letter
ValueCountFrequency (%)
9
 
6.4%
8
 
5.7%
6
 
4.3%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
5
 
3.6%
4
 
2.9%
3
 
2.1%
Other values (62) 85
60.7%
Uppercase Letter
ValueCountFrequency (%)
P 27288
14.2%
F 25447
13.2%
C 19698
 
10.2%
M 12962
 
6.7%
S 11571
 
6.0%
E 9465
 
4.9%
A 9161
 
4.8%
T 9055
 
4.7%
B 8751
 
4.5%
G 7590
 
3.9%
Other values (52) 51485
26.7%
Other Punctuation
ValueCountFrequency (%)
, 35757
83.7%
. 5536
 
13.0%
& 744
 
1.7%
/ 625
 
1.5%
! 36
 
0.1%
% 17
 
< 0.1%
: 9
 
< 0.1%
@ 5
 
< 0.1%
; 3
 
< 0.1%
# 3
 
< 0.1%
Other values (4) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 993
23.9%
1 676
16.3%
0 629
15.1%
3 519
12.5%
4 461
11.1%
9 198
 
4.8%
6 192
 
4.6%
7 170
 
4.1%
8 159
 
3.8%
5 157
 
3.8%
Open Punctuation
ValueCountFrequency (%)
( 4130
99.8%
[ 9
 
0.2%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 4129
99.8%
] 9
 
0.2%
1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4147
> 99.9%
2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 593
99.8%
| 1
 
0.2%
Other Symbol
ValueCountFrequency (%)
° 23
92.0%
2
 
8.0%
Final Punctuation
ValueCountFrequency (%)
3
50.0%
» 3
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 3
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Format
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1148197
95.0%
Common 59955
 
5.0%
Cyrillic 373
 
< 0.1%
Hangul 115
 
< 0.1%
Greek 31
 
< 0.1%
Han 26
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 103753
 
9.0%
e 91275
 
7.9%
n 87195
 
7.6%
o 82733
 
7.2%
r 81320
 
7.1%
t 81276
 
7.1%
a 74646
 
6.5%
s 60744
 
5.3%
l 49324
 
4.3%
m 42956
 
3.7%
Other values (99) 392975
34.2%
Hangul
ValueCountFrequency (%)
9
 
7.8%
8
 
7.0%
6
 
5.2%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
5
 
4.3%
4
 
3.5%
3
 
2.6%
Other values (43) 60
52.2%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
ь 16
 
4.3%
с 16
 
4.3%
е 16
 
4.3%
Other values (36) 159
42.6%
Common
ValueCountFrequency (%)
, 35757
59.6%
. 5536
 
9.2%
- 4147
 
6.9%
( 4130
 
6.9%
) 4129
 
6.9%
2 993
 
1.7%
& 744
 
1.2%
1 676
 
1.1%
0 629
 
1.0%
/ 625
 
1.0%
Other values (31) 2589
 
4.3%
Greek
ValueCountFrequency (%)
ο 3
 
9.7%
ν 3
 
9.7%
ρ 2
 
6.5%
τ 2
 
6.5%
Κ 2
 
6.5%
ι 2
 
6.5%
η 2
 
6.5%
λ 2
 
6.5%
Ε 2
 
6.5%
ό 1
 
3.2%
Other values (10) 10
32.3%
Han
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1203076
99.5%
None 5102
 
0.4%
Cyrillic 373
 
< 0.1%
Hangul 113
 
< 0.1%
CJK 26
 
< 0.1%
Punctuation 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 103753
 
8.6%
e 91275
 
7.6%
n 87195
 
7.2%
o 82733
 
6.9%
r 81320
 
6.8%
t 81276
 
6.8%
a 74646
 
6.2%
s 60744
 
5.0%
l 49324
 
4.1%
m 42956
 
3.6%
Other values (73) 447854
37.2%
None
ValueCountFrequency (%)
é 2747
53.8%
ó 377
 
7.4%
á 301
 
5.9%
í 166
 
3.3%
ñ 146
 
2.9%
ü 143
 
2.8%
ä 133
 
2.6%
ö 127
 
2.5%
ô 127
 
2.5%
ç 118
 
2.3%
Other values (74) 717
 
14.1%
Cyrillic
ValueCountFrequency (%)
и 34
 
9.1%
о 28
 
7.5%
а 26
 
7.0%
л 22
 
5.9%
н 20
 
5.4%
м 19
 
5.1%
т 17
 
4.6%
ь 16
 
4.3%
с 16
 
4.3%
е 16
 
4.3%
Other values (36) 159
42.6%
Hangul
ValueCountFrequency (%)
9
 
8.0%
8
 
7.1%
6
 
5.3%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
5
 
4.4%
4
 
3.5%
3
 
2.7%
Other values (42) 58
51.3%
Punctuation
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
CJK
ValueCountFrequency (%)
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (9) 9
34.6%

countries_iso
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct: 2383
Distinct (%): 6.1%
Missing: 6213
Missing (%): 13.7%
Memory size: 354.4 KiB
US
17836 
GB
2235 
FR
 
1652
JP
 
1354
IT
 
1029
Other values (2378)
15027 

Length

Max length74
Median length2
Mean length2.782332
Min length2

Characters and Unicode

Total characters108881
Distinct characters27
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1764 ?
Unique (%)4.5%

Sample

1st rowUS
2nd rowUS
3rd rowUS
4th rowUS
5th rowUS

Common Values

Value                Count   Frequency (%)
US                   17836   39.3%
GB                   2235    4.9%
FR                   1652    3.6%
JP                   1354    3.0%
IT                   1029    2.3%
CA                   840     1.9%
DE                   748     1.6%
IN                   735     1.6%
RU                   734     1.6%
GB,US                569     1.3%
Other values (2373)  11401   25.1%
(Missing)            6213    13.7%

Length

Histogram of lengths of the category

Value                Count   Frequency (%)
us                   17836   45.6%
gb                   2235    5.7%
fr                   1652    4.2%
jp                   1354    3.5%
it                   1029    2.6%
ca                   840     2.1%
de                   748     1.9%
in                   735     1.9%
ru                   734     1.9%
gb,us                569     1.5%
Other values (2373)  11401   29.1%
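
The Common Values table and the lowercase table give the same counts but different percentages (39.3% versus 45.6% for US). The report does not say why. The numbers are consistent with the first being a share of all rows and the second a share of non-missing rows. The sketch below checks that reading; it is an inference from the figures, and the variable names are illustrative.

```python
# Figures copied from the report: total observations, missing
# countries_iso cells, and the count for the value "US".
n_rows = 45346
n_missing = 6213
us_count = 17836

# Share relative to every row, missing ones included.
share_of_all_rows = us_count / n_rows

# Share relative to rows that actually have a value.
share_of_non_missing = us_count / (n_rows - n_missing)

print(f"share of all rows:         {share_of_all_rows:.1%}")
print(f"share of non-missing rows: {share_of_non_missing:.1%}")
```

When a column has many missing cells, it is worth stating which denominator a quoted percentage uses.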

Most occurring characters

ValueCountFrequency (%)
S 23026
21.1%
U 23009
21.1%
, 10205
9.4%
R 6674
 
6.1%
B 4977
 
4.6%
E 4743
 
4.4%
G 4445
 
4.1%
F 4329
 
4.0%
I 3999
 
3.7%
A 3130
 
2.9%
Other values (17) 20344
18.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 98676
90.6%
Other Punctuation 10205
 
9.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 23026
23.3%
U 23009
23.3%
R 6674
 
6.8%
B 4977
 
5.0%
E 4743
 
4.8%
G 4445
 
4.5%
F 4329
 
4.4%
I 3999
 
4.1%
A 3130
 
3.2%
T 3000
 
3.0%
Other values (16) 17344
17.6%
Other Punctuation
ValueCountFrequency (%)
, 10205
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 98676
90.6%
Common 10205
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 23026
23.3%
U 23009
23.3%
R 6674
 
6.8%
B 4977
 
5.0%
E 4743
 
4.8%
G 4445
 
4.5%
F 4329
 
4.4%
I 3999
 
4.1%
A 3130
 
3.2%
T 3000
 
3.0%
Other values (16) 17344
17.6%
Common
ValueCountFrequency (%)
, 10205
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 108881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 23026
21.1%
U 23009
21.1%
, 10205
9.4%
R 6674
 
6.1%
B 4977
 
4.6%
E 4743
 
4.4%
G 4445
 
4.1%
F 4329
 
4.0%
I 3999
 
3.7%
A 3130
 
2.9%
Other values (17) 20344
18.7%

countries_name
Categorical

HIGH CARDINALITY  IMBALANCE  MISSING 

Distinct2383
Distinct (%)6.1%
Missing6213
Missing (%)13.7%
Memory size354.4 KiB
UnitedStatesofAmerica
17836 
UnitedKingdom
2235 
France
 
1652
Japan
 
1354
Italy
 
1029
Other values (2378)
15027 

Length

Max length211
Median length148
Mean length17.008433
Min length4

Characters and Unicode

Total characters665591
Distinct characters51
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1764 ?
Unique (%)4.5%

Sample

1st rowUnitedStatesofAmerica
2nd rowUnitedStatesofAmerica
3rd rowUnitedStatesofAmerica
4th rowUnitedStatesofAmerica
5th rowUnitedStatesofAmerica

Common Values

ValueCountFrequency (%)
UnitedStatesofAmerica 17836
39.3%
UnitedKingdom 2235
 
4.9%
France 1652
 
3.6%
Japan 1354
 
3.0%
Italy 1029
 
2.3%
Canada 840
 
1.9%
Germany 748
 
1.6%
India 735
 
1.6%
Russia 734
 
1.6%
UnitedKingdom,UnitedStatesofAmerica 569
 
1.3%
Other values (2373) 11401
25.1%
(Missing) 6213
 
13.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
unitedstatesofamerica 17836
45.6%
unitedkingdom 2235
 
5.7%
france 1652
 
4.2%
japan 1354
 
3.5%
italy 1029
 
2.6%
canada 840
 
2.1%
germany 748
 
1.9%
india 735
 
1.9%
russia 734
 
1.9%
unitedkingdom,unitedstatesofamerica 569
 
1.5%
Other values (2373) 11401
29.1%

Most occurring characters

ValueCountFrequency (%)
e 80562
12.1%
t 72563
 
10.9%
a 70400
 
10.6%
i 58494
 
8.8%
n 47439
 
7.1%
d 34515
 
5.2%
r 32443
 
4.9%
o 29543
 
4.4%
m 28675
 
4.3%
c 26338
 
4.0%
Other values (41) 184619
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 557931
83.8%
Uppercase Letter 97455
 
14.6%
Other Punctuation 10205
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 80562
14.4%
t 72563
13.0%
a 70400
12.6%
i 58494
10.5%
n 47439
8.5%
d 34515
6.2%
r 32443
5.8%
o 29543
 
5.3%
m 28675
 
5.1%
c 26338
 
4.7%
Other values (16) 76959
13.8%
Uppercase Letter
ValueCountFrequency (%)
U 25351
26.0%
S 23818
24.4%
A 22375
23.0%
K 5214
 
5.4%
F 4321
 
4.4%
I 3576
 
3.7%
C 2591
 
2.7%
G 2467
 
2.5%
J 1661
 
1.7%
R 1304
 
1.3%
Other values (14) 4777
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 10205
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 655386
98.5%
Common 10205
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 80562
12.3%
t 72563
11.1%
a 70400
10.7%
i 58494
 
8.9%
n 47439
 
7.2%
d 34515
 
5.3%
r 32443
 
5.0%
o 29543
 
4.5%
m 28675
 
4.4%
c 26338
 
4.0%
Other values (40) 174414
26.6%
Common
ValueCountFrequency (%)
, 10205
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 665591
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 80562
12.1%
t 72563
 
10.9%
a 70400
 
10.6%
i 58494
 
8.8%
n 47439
 
7.1%
d 34515
 
5.2%
r 32443
 
4.9%
o 29543
 
4.4%
m 28675
 
4.3%
c 26338
 
4.0%
Other values (41) 184619
27.7%

release_date
Categorical

Distinct17333
Distinct (%)38.2%
Missing0
Missing (%)0.0%
Memory size354.4 KiB
2008-01-01
 
136
2009-01-01
 
121
2007-01-01
 
117
2005-01-01
 
111
2006-01-01
 
101
Other values (17328)
44760 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters453460
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8579 ?
Unique (%)18.9%

Sample

1st row1995-10-30
2nd row1995-12-15
3rd row1995-12-22
4th row1995-12-22
5th row1995-02-10

Common Values

ValueCountFrequency (%)
2008-01-01 136
 
0.3%
2009-01-01 121
 
0.3%
2007-01-01 117
 
0.3%
2005-01-01 111
 
0.2%
2006-01-01 101
 
0.2%
2002-01-01 96
 
0.2%
2004-01-01 90
 
0.2%
2001-01-01 84
 
0.2%
2003-01-01 76
 
0.2%
1997-01-01 69
 
0.2%
Other values (17323) 44345
97.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2008-01-01 136
 
0.3%
2009-01-01 121
 
0.3%
2007-01-01 117
 
0.3%
2005-01-01 111
 
0.2%
2006-01-01 101
 
0.2%
2002-01-01 96
 
0.2%
2004-01-01 90
 
0.2%
2001-01-01 84
 
0.2%
2003-01-01 76
 
0.2%
1997-01-01 69
 
0.2%
Other values (17323) 44345
97.8%

Most occurring characters

ValueCountFrequency (%)
0 97532
21.5%
- 90692
20.0%
1 84002
18.5%
2 52761
11.6%
9 39752
8.8%
3 15418
 
3.4%
8 15269
 
3.4%
6 15010
 
3.3%
5 14828
 
3.3%
7 14282
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 362768
80.0%
Dash Punctuation 90692
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 97532
26.9%
1 84002
23.2%
2 52761
14.5%
9 39752
11.0%
3 15418
 
4.3%
8 15269
 
4.2%
6 15010
 
4.1%
5 14828
 
4.1%
7 14282
 
3.9%
4 13914
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 90692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 453460
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 97532
21.5%
- 90692
20.0%
1 84002
18.5%
2 52761
11.6%
9 39752
8.8%
3 15418
 
3.4%
8 15269
 
3.4%
6 15010
 
3.3%
5 14828
 
3.3%
7 14282
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 453460
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 97532
21.5%
- 90692
20.0%
1 84002
18.5%
2 52761
11.6%
9 39752
8.8%
3 15418
 
3.4%
8 15269
 
3.4%
6 15010
 
3.3%
5 14828
 
3.3%
7 14282
 
3.1%

month_time
Categorical

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size354.4 KiB
enero
5909 
septiembre
4834 
octubre
4613 
diciembre
3781 
noviembre
3661 
Other values (7)
22548 

Length

Max length10
Median length9
Mean length6.5277202
Min length4

Characters and Unicode

Total characters296006
Distinct characters21
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowoctubre
2nd rowdiciembre
3rd rowdiciembre
4th rowdiciembre
5th rowfebrero

Common Values

Value             Count   Frequency (%)
enero             5909    13.0%
septiembre        4834    10.7%
octubre           4613    10.2%
diciembre         3781    8.3%
noviembre         3661    8.1%
marzo             3549    7.8%
abril             3452    7.6%
agosto            3393    7.5%
mayo              3337    7.4%
junio             3151    6.9%
Other values (2)  5666    12.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
enero 5909
13.0%
septiembre 4834
10.7%
octubre 4613
10.2%
diciembre 3781
8.3%
noviembre 3661
8.1%
marzo 3549
7.8%
abril 3452
7.6%
agosto 3393
7.5%
mayo 3337
7.4%
junio 3151
6.9%
Other values (2) 5666
12.5%

Most occurring characters

ValueCountFrequency (%)
e 51873
17.5%
o 36672
12.4%
r 35855
12.1%
i 25298
8.5%
b 23369
7.9%
m 19162
 
6.5%
a 13731
 
4.6%
t 12840
 
4.3%
n 12721
 
4.3%
u 10402
 
3.5%
Other values (11) 54083
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 296006
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 51873
17.5%
o 36672
12.4%
r 35855
12.1%
i 25298
8.5%
b 23369
7.9%
m 19162
 
6.5%
a 13731
 
4.6%
t 12840
 
4.3%
n 12721
 
4.3%
u 10402
 
3.5%
Other values (11) 54083
18.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 296006
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 51873
17.5%
o 36672
12.4%
r 35855
12.1%
i 25298
8.5%
b 23369
7.9%
m 19162
 
6.5%
a 13731
 
4.6%
t 12840
 
4.3%
n 12721
 
4.3%
u 10402
 
3.5%
Other values (11) 54083
18.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 296006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 51873
17.5%
o 36672
12.4%
r 35855
12.1%
i 25298
8.5%
b 23369
7.9%
m 19162
 
6.5%
a 13731
 
4.6%
t 12840
 
4.3%
n 12721
 
4.3%
u 10402
 
3.5%
Other values (11) 54083
18.3%

day_time
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 354.4 KiB
viernes
13902 
jueves
7520 
miercoles
7027 
sabado
5149 
martes
4644 
Other values (2)
7104 

Length

Max length9
Median length7
Mean length6.7738941
Min length5

Characters and Unicode

Total characters307169
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row: lunes
2nd row: viernes
3rd row: viernes
4th row: viernes
5th row: viernes
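
The day_time and month_time samples line up with the release_date samples, which suggests both columns were derived from release_date using Spanish names without accents. The report does not show how this was done, so the sketch below is one plausible reconstruction. The `DAYS_ES` and `MONTHS_ES` lists and the `day_and_month` helper are assumptions introduced for illustration.

```python
from datetime import date

# Spanish names without accents, matching the spellings in the report.
# The mapping is an assumption about how the columns were built.
DAYS_ES = ["lunes", "martes", "miercoles", "jueves",
           "viernes", "sabado", "domingo"]
MONTHS_ES = ["enero", "febrero", "marzo", "abril", "mayo", "junio", "julio",
             "agosto", "septiembre", "octubre", "noviembre", "diciembre"]

def day_and_month(release_date):
    """Return (day name, month name) for an ISO 'YYYY-MM-DD' string."""
    d = date.fromisoformat(release_date)
    # date.weekday() numbers Monday as 0, which matches DAYS_ES.
    return DAYS_ES[d.weekday()], MONTHS_ES[d.month - 1]

# The five release_date sample rows from this report.
for s in ["1995-10-30", "1995-12-15", "1995-12-22", "1995-12-22", "1995-02-10"]:
    print(s, day_and_month(s))
```

For the five release_date sample rows, this mapping reproduces the day_time and month_time sample rows shown in the report.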

Common Values

Value      Count   Frequency (%)
viernes    13902   30.7%
jueves     7520    16.6%
miercoles  7027    15.5%
sabado     5149    11.4%
martes     4644    10.2%
domingo    3607    8.0%
lunes      3497    7.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
viernes 13902
30.7%
jueves 7520
16.6%
miercoles 7027
15.5%
sabado 5149
 
11.4%
martes 4644
 
10.2%
domingo 3607
 
8.0%
lunes 3497
 
7.7%

Most occurring characters

ValueCountFrequency (%)
e 65039
21.2%
s 41739
13.6%
r 25573
 
8.3%
i 24536
 
8.0%
v 21422
 
7.0%
n 21006
 
6.8%
o 19390
 
6.3%
m 15278
 
5.0%
a 14942
 
4.9%
u 11017
 
3.6%
Other values (7) 47227
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 307169
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 65039
21.2%
s 41739
13.6%
r 25573
 
8.3%
i 24536
 
8.0%
v 21422
 
7.0%
n 21006
 
6.8%
o 19390
 
6.3%
m 15278
 
5.0%
a 14942
 
4.9%
u 11017
 
3.6%
Other values (7) 47227
15.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 307169
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 65039
21.2%
s 41739
13.6%
r 25573
 
8.3%
i 24536
 
8.0%
v 21422
 
7.0%
n 21006
 
6.8%
o 19390
 
6.3%
m 15278
 
5.0%
a 14942
 
4.9%
u 11017
 
3.6%
Other values (7) 47227
15.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 307169
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 65039
21.2%
s 41739
13.6%
r 25573
 
8.3%
i 24536
 
8.0%
v 21422
 
7.0%
n 21006
 
6.8%
o 19390
 
6.3%
m 15278
 
5.0%
a 14942
 
4.9%
u 11017
 
3.6%
Other values (7) 47227
15.4%

Interactions

[Figure: interaction plots]
2023-07-01T07:47:24.804885image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:29.314785image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:34.067952image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:38.511931image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:43.810410image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:02.448072image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:06.933655image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:11.787328image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:16.261656image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:20.692310image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:25.289860image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:29.926593image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:34.474609image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:38.934636image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:44.288835image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:02.984490image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:07.485517image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:12.240493image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:16.747507image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:21.161555image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:25.759292image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:30.647801image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:34.913377image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
2023-07-01T07:47:39.764015image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/

Correlations

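variable | id | popularity | vote_average | vote_count | runtime | budget | revenue | id_btc | release_year | return | status | original_language | month_time | day_time
id | 1.000 | -0.410 | -0.149 | -0.433 | -0.214 | -0.255 | -0.278 | 0.445 | 0.392 | -0.263 | 0.056 | 0.071 | 0.038 | 0.040
popularity | -0.410 | 1.000 | 0.241 | 0.894 | 0.315 | 0.463 | 0.491 | -0.309 | 0.186 | 0.446 | 0.000 | 0.000 | 0.006 | 0.004
vote_average | -0.149 | 0.241 | 1.000 | 0.317 | 0.196 | 0.072 | 0.127 | -0.027 | -0.009 | 0.121 | 0.019 | 0.070 | 0.026 | 0.044
vote_count | -0.433 | 0.894 | 0.317 | 1.000 | 0.298 | 0.484 | 0.513 | -0.330 | 0.197 | 0.473 | 0.000 | 0.000 | 0.021 | 0.029
runtime | -0.214 | 0.315 | 0.196 | 0.298 | 1.000 | 0.229 | 0.255 | -0.170 | 0.032 | 0.235 | 0.000 | 0.111 | 0.026 | 0.028
budget | -0.255 | 0.463 | 0.072 | 0.484 | 0.229 | 1.000 | 0.644 | -0.290 | 0.141 | 0.771 | 0.000 | 0.000 | 0.035 | 0.040
revenue | -0.278 | 0.491 | 0.127 | 0.513 | 0.255 | 0.644 | 1.000 | -0.320 | 0.103 | 0.849 | 0.000 | 0.000 | 0.029 | 0.025
id_btc | 0.445 | -0.309 | -0.027 | -0.330 | -0.170 | -0.290 | -0.320 | 1.000 | 0.094 | -0.307 | 0.000 | 0.170 | 0.046 | 0.072
release_year | 0.392 | 0.186 | -0.009 | 0.197 | 0.032 | 0.141 | 0.103 | 0.094 | 1.000 | 0.085 | 0.028 | 0.144 | 0.044 | 0.082
return | -0.263 | 0.446 | 0.121 | 0.473 | 0.235 | 0.771 | 0.849 | -0.307 | 0.085 | 1.000 | 0.000 | 0.000 | 0.006 | 0.000
status | 0.056 | 0.000 | 0.019 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.028 | 0.000 | 1.000 | 0.000 | 0.005 | 0.004
original_language | 0.071 | 0.000 | 0.070 | 0.000 | 0.111 | 0.000 | 0.000 | 0.170 | 0.144 | 0.000 | 0.000 | 1.000 | 0.045 | 0.171
month_time | 0.038 | 0.006 | 0.026 | 0.021 | 0.026 | 0.035 | 0.029 | 0.046 | 0.044 | 0.006 | 0.005 | 0.045 | 1.000 | 0.048
day_time | 0.040 | 0.004 | 0.044 | 0.029 | 0.028 | 0.040 | 0.025 | 0.072 | 0.082 | 0.000 | 0.004 | 0.171 | 0.048 | 1.000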
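
A matrix like the one above can be approximated with pandas. The report does not state which coefficient it uses, so the sketch below assumes Spearman rank correlation for the numeric columns; the small frame is a hypothetical stand-in built from six of the sample rows shown under Sample, not the full 45346-row dataset.

```python
import pandas as pd

# Hypothetical miniature frame: budget, revenue and vote_count copied from
# six sample rows of this report (Toy Story, Jumanji, Waiting to Exhale,
# Heat, Sudden Death, GoldenEye).
df = pd.DataFrame({
    "budget": [30000000.0, 65000000.0, 16000000.0, 60000000.0, 35000000.0, 58000000.0],
    "revenue": [373554033.0, 262797249.0, 81452156.0, 187436818.0, 64350171.0, 352194034.0],
    "vote_count": [5415.0, 2413.0, 34.0, 1886.0, 174.0, 1194.0],
})

# Assumption: Spearman rank correlation, rounded to three decimals
# to match the layout of the table in the report.
corr = df.corr(method="spearman").round(3)
print(corr)
```

With only six rows the coefficients will not match the report's values; the point is the shape of the computation, a symmetric matrix with ones on the diagonal.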

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
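
The quantities behind these three views can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the report's own implementation: the frame is a hypothetical four-row excerpt using column names from this dataset, and nullity correlation is taken to be the Pearson correlation between per-column "is missing" indicators.

```python
import numpy as np
import pandas as pd

# Hypothetical excerpt; column names come from the report, rows are illustrative.
df = pd.DataFrame({
    "tagline": [np.nan, "Roll the dice and unleash the excitement!", np.nan, "A deadly game of wits."],
    "id_btc": [10194.0, np.nan, np.nan, np.nan],
    "name_btc": ["Toy Story Collection", np.nan, np.nan, np.nan],
    "runtime": [81, 104, 45, 90],
})

# Nullity by column: number of missing cells per variable.
null_counts = df.isna().sum()

# Nullity matrix: one 0/1 indicator per cell (1 = missing).
is_null = df.isna().astype(int)

# Nullity correlation: correlate the indicators, skipping columns that are
# always present or always missing (their indicator has zero variance).
varying = is_null.loc[:, is_null.nunique() > 1]
nullity_corr = varying.corr()

print(null_counts)
print(nullity_corr.round(3))
```

Here `id_btc` and `name_btc` are missing in exactly the same rows, so their nullity correlation is 1; `runtime` is never missing and drops out of the correlation.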

Sample

First rows

(index) | id | title | overview | popularity | vote_average | vote_count | status | original_language | runtime | budget | revenue | tagline | id_btc | name_btc | poster_btc | backdrop_btc | iso_639_1 | language_name | release_year | return | companies_id | companies_name | countries_iso | countries_name | release_date | month_time | day_time
0 | 862 | Toy Story | Led by Woody, Andy's toys live happily in his room until Andy's birthday brings Buzz Lightyear onto the scene. Afraid of losing his place in Andy's heart, Woody plots against Buzz. But when circumstances separate Buzz and Woody from their owner, the duo eventually learns to put aside their differences. | 21.946943 | 7.7 | 5415.0 | Released | en | 81 | 30000000.0 | 373554033.0 | NaN | 10194.0 | Toy Story Collection | /7G9915LfUQ2lVfwMEEhDsn3kT4B.jpg | /9FBwqcd9IRruEDUrTdcaafOMKUq.jpg | en | English | 1995 | 12.45 | 3 | Pixar Animation Studios | US | United States of America | 1995-10-30 | octubre | lunes
1 | 8844 | Jumanji | When siblings Judy and Peter discover an enchanted board game that opens the door to a magical world, they unwittingly invite Alan -- an adult who's been trapped inside the game for 26 years -- into their living room. Alan's only hope for freedom is to finish the game, which proves risky as all three find themselves running from giant rhinoceroses, evil monkeys and other terrifying creatures. | 17.015539 | 6.9 | 2413.0 | Released | en | 104 | 65000000.0 | 262797249.0 | Roll the dice and unleash the excitement! | NaN | NaN | NaN | NaN | en,fr | English, Français | 1995 | 4.04 | 559,2550,10201 | TriStar Pictures, Teitler Film, Interscope Communications | US | United States of America | 1995-12-15 | diciembre | viernes
2 | 15602 | Grumpier Old Men | A family wedding reignites the ancient feud between next-door neighbors and fishing buddies John and Max. Meanwhile, a sultry Italian divorcée opens a restaurant at the local bait shop, alarming the locals who worry she'll scare the fish away. But she's less interested in seafood than she is in cooking up a hot time with Max. | 11.712900 | 6.5 | 92.0 | Released | en | 101 | 0.0 | 0.0 | Still Yelling. Still Fighting. Still Ready for Love. | 119050.0 | Grumpy Old Men Collection | /nLvUdqgPgm3F85NMCii9gVFUcet.jpg | /hypTnLot2z8wpFS7qwsQHW1uV8u.jpg | en | English | 1995 | 0.00 | 6194,19464 | Warner Bros., Lancaster Gate | US | United States of America | 1995-12-22 | diciembre | viernes
3 | 31357 | Waiting to Exhale | Cheated on, mistreated and stepped on, the women are holding their breath, waiting for the elusive "good man" to break a string of less-than-stellar lovers. Friends and confidants Vannah, Bernie, Glo and Robin talk it all out, determined to find a better way to breathe. | 3.859495 | 6.1 | 34.0 | Released | en | 127 | 16000000.0 | 81452156.0 | Friends are the people who let you be yourself... and never let you forget it. | NaN | NaN | NaN | NaN | en | English | 1995 | 5.09 | 306 | Twentieth Century Fox Film Corporation | US | United States of America | 1995-12-22 | diciembre | viernes
4 | 11862 | Father of the Bride Part II | Just when George Banks has recovered from his daughter's wedding, he receives the news that she's pregnant ... and that George's wife, Nina, is expecting too. He was planning on selling their home, but that's a plan that -- like George -- will have to change with the arrival of both a grandchild and a kid of his own. | 8.387519 | 5.7 | 173.0 | Released | en | 106 | 0.0 | 76578911.0 | Just When His World Is Back To Normal... He's In For The Surprise Of His Life! | 96871.0 | Father of the Bride Collection | /nts4iOmNnq7GNicycMJ9pSAn204.jpg | /7qwE57OVZmMJChBpLEbJEmzUydk.jpg | en | English | 1995 | 0.00 | 5842,9195 | Sandollar Productions, Touchstone Pictures | US | United States of America | 1995-02-10 | febrero | viernes
5 | 949 | Heat | Obsessive master thief, Neil McCauley leads a top-notch crew on various insane heists throughout Los Angeles while a mentally unstable detective, Vincent Hanna pursues him without rest. Each man recognizes and respects the ability and the dedication of the other even though they are aware their cat-and-mouse game may end in violence. | 17.924927 | 7.7 | 1886.0 | Released | en | 170 | 60000000.0 | 187436818.0 | A Los Angeles Crime Saga | NaN | NaN | NaN | NaN | en,es | English, Español | 1995 | 3.12 | 508,675,6194 | Regency Enterprises, Forward Pass, Warner Bros. | US | United States of America | 1995-12-15 | diciembre | viernes
6 | 11860 | Sabrina | An ugly duckling having undergone a remarkable change, still harbors feelings for her crush: a carefree playboy, but not before his business-focused brother has something to say about it. | 6.677277 | 6.2 | 141.0 | Released | en | 127 | 58000000.0 | 0.0 | You are cordially invited to the most surprising merger of the year. | NaN | NaN | NaN | NaN | fr,en | Français, English | 1995 | 0.00 | 4,258,932,5842,14941,55873,58079 | Paramount Pictures, Scott Rudin Productions, Mirage Enterprises, Sandollar Productions, Constellation Entertainment, Worldwide, Mont Blanc Entertainment GmbH | DE,US | Germany, United States of America | 1995-12-15 | diciembre | viernes
7 | 45325 | Tom and Huck | A mischievous young boy, Tom Sawyer, witnesses a murder by the deadly Injun Joe. Tom becomes friends with Huckleberry Finn, a boy with no future and no family. Tom has to choose between honoring a friendship or honoring an oath because the town alcoholic is accused of the murder. Tom and Huck go through several adventures trying to retrieve evidence. | 2.561161 | 5.4 | 45.0 | Released | en | 97 | 0.0 | 0.0 | The Original Bad Boys. | NaN | NaN | NaN | NaN | en,de | English, Deutsch | 1995 | 0.00 | 2 | Walt Disney Pictures | US | United States of America | 1995-12-22 | diciembre | viernes
8 | 9091 | Sudden Death | International action superstar Jean Claude Van Damme teams with Powers Boothe in a Tension-packed, suspense thriller, set against the back-drop of a Stanley Cup game. Van Damme portrays a father whose daughter is suddenly taken during a championship hockey game. With the captors demanding a billion dollars by game's end, Van Damme frantically sets a plan in motion to rescue his daughter and abort an impending explosion before the final buzzer... | 5.231580 | 5.5 | 174.0 | Released | en | 106 | 35000000.0 | 64350171.0 | Terror goes into overtime. | NaN | NaN | NaN | NaN | en | English | 1995 | 1.84 | 33,21437,23770 | Universal Pictures, Imperial Entertainment, Signature Entertainment | US | United States of America | 1995-12-22 | diciembre | viernes
9 | 710 | GoldenEye | James Bond must unmask the mysterious head of the Janus Syndicate and prevent the leader from utilizing the GoldenEye weapons system to inflict devastating revenge on Britain. | 14.686036 | 6.6 | 1194.0 | Released | en | 130 | 58000000.0 | 352194034.0 | No limits. No fears. No substitutes. | 645.0 | James Bond Collection | /HORpg5CSkmeQlAolx3bKMrKgfi.jpg | /6VcVl48kNKvdXOZfJPdarlUGOsk.jpg | en,ru,es | English, Pусский, Español | 1995 | 6.07 | 60,7576 | United Artists, Eon Productions | GB,US | United Kingdom, United States of America | 1995-11-16 | noviembre | jueves

Last rows

(index) | id | title | overview | popularity | vote_average | vote_count | status | original_language | runtime | budget | revenue | tagline | id_btc | name_btc | poster_btc | backdrop_btc | iso_639_1 | language_name | release_year | return | companies_id | companies_name | countries_iso | countries_name | release_date | month_time | day_time
45336 | 67179 | St. Michael Had a Rooster | Sentenced to life imprisonment for illegal activities, Italian International member Giulio Manieri holds on to his political ideals while struggling against madness in the loneliness of his prison cell. | 0.225051 | 6.0 | 3.0 | Released | it | 90 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | it | Italiano | 1972 | 0.0 | NaN | NaN | NaN | NaN | 1972-01-01 | enero | sabado
45337 | 84419 | House of Horrors | An unsuccessful sculptor saves a madman named "The Creeper" from drowning. Seeing an opportunity for revenge, he tricks the psycho into murdering his critics. | 0.222814 | 6.3 | 8.0 | Released | en | 65 | 0.0 | 0.0 | Meet...The CREEPER! | NaN | NaN | NaN | NaN | en | English | 1946 | 0.0 | 33 | Universal Pictures | US | United States of America | 1946-03-29 | marzo | viernes
45338 | 390959 | Shadow of the Blair Witch | In this true-crime documentary, we delve into the murder spree that was the inspiration for Joe Berlinger's "Book of Shadows: Blair Witch 2". | 0.076061 | 7.0 | 2.0 | Released | en | 45 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | en | English | 2000 | 0.0 | NaN | NaN | NaN | NaN | 2000-10-22 | octubre | domingo
45339 | 289923 | The Burkittsville 7 | A film archivist revisits the story of Rustin Parr, a hermit thought to have murdered seven children while under the possession of the Blair Witch. | 0.386450 | 7.0 | 1.0 | Released | en | 30 | 0.0 | 0.0 | Do you know what happened 50 years before "The Blair Witch Project"? | NaN | NaN | NaN | NaN | en | English | 2000 | 0.0 | 27570,27571 | Neptune Salad Entertainment, Pirie Productions | US | United States of America | 2000-10-03 | octubre | martes
45340 | 222848 | Caged Heat 3000 | It's the year 3000 AD. The world's most dangerous women are banished to a remote asteroid 45 million light years from earth. Kira Murphy doesn't belong; wrongfully accused of a crime she did not commit, she's thrown in this interplanetary prison and left to her own defenses. But Kira's a fighter, and soon she finds herself in the middle of a female gang war; where everyone wants a piece of the action... and a piece of her! "Caged Heat 3000" takes the Women-in-Prison genre to a whole new level... and a whole new galaxy! | 0.661558 | 3.5 | 1.0 | Released | en | 85 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | en | English | 1995 | 0.0 | 4688 | Concorde-New Horizons | US | United States of America | 1995-01-01 | enero | domingo
45341 | 30840 | Robin Hood | Yet another version of the classic epic, with enough variation to make it interesting. The story is the same, but some of the characters are quite different from the usual, in particular Uma Thurman's very special maid Marian. The photography is also great, giving the story a somewhat darker tone. | 5.683753 | 5.7 | 26.0 | Released | en | 104 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | en | English | 1991 | 0.0 | 7025,10163,16323,38978 | Westdeutscher Rundfunk (WDR), Working Title Films, 20th Century Fox Television, CanWest Global Communications | CA,DE,GB,US | Canada, Germany, United Kingdom, United States of America | 1991-05-13 | mayo | lunes
45342 | 111109 | Century of Birthing | An artist struggles to finish his work while a storyline about a cult plays in his head. | 0.178241 | 9.0 | 3.0 | Released | tl | 360 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | tl | NaN | 2011 | 0.0 | 19653 | Sine Olivia | PH | Philippines | 2011-11-17 | noviembre | jueves
45343 | 67758 | Betrayal | When one of her hits goes wrong, a professional assassin ends up with a suitcase full of a million dollars belonging to a mob boss... | 0.903007 | 3.8 | 6.0 | Released | en | 90 | 0.0 | 0.0 | A deadly game of wits. | NaN | NaN | NaN | NaN | en | English | 2003 | 0.0 | 6165 | American World Pictures | US | United States of America | 2003-08-01 | agosto | viernes
45344 | 227506 | Satan Triumphant | In a small town live two brothers, one a minister and the other one a hunchback painter of the chapel who lives with his wife. One dreadful and stormy night, a stranger knocks at the door asking for shelter. The stranger talks about all the good things of the earthly life the minister is missing because of his puritanical faith. The minister comes to accept the stranger's viewpoint but it is others who will pay the consequences because the minister will discover the human pleasures thanks to, ehem, his sister-in-law… The tormented minister and his cuckolded brother will die in a strange accident in the chapel and later an infant will be born from the minister's adulterous relationship. | 0.003503 | 0.0 | 0.0 | Released | en | 87 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1917 | 0.0 | 88753 | Yermoliev | RU | Russia | 1917-10-21 | octubre | domingo
45345 | 461257 | Queerama | 50 years after decriminalisation of homosexuality in the UK, director Daisy Asquith mines the jewels of the BFI archive to take us into the relationships, desires, fears and expressions of gay men and women in the 20th century. | 0.163015 | 0.0 | 0.0 | Released | en | 75 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | en | English | 2017 | 0.0 | NaN | NaN | GB | United Kingdom | 2017-06-09 | junio | viernes
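
The sample values are consistent with `return` being revenue divided by budget, with 0 wherever the budget is 0 (for Toy Story, 373554033 / 30000000 ≈ 12.45). The report itself does not define the column, so the sketch below is an inference from the sample rows rather than the dataset author's method; the rounding to two decimals is likewise an assumption made to match the first rows.

```python
import pandas as pd

# Four rows copied from the "first rows" sample of this report.
df = pd.DataFrame({
    "title": ["Toy Story", "Jumanji", "Grumpier Old Men", "Sabrina"],
    "budget": [30000000.0, 65000000.0, 0.0, 58000000.0],
    "revenue": [373554033.0, 262797249.0, 0.0, 0.0],
})

# Assumed definition: return = revenue / budget, and 0 when budget is 0.
# Masking zero budgets with NaN first avoids division-by-zero artifacts.
ratio = df["revenue"].div(df["budget"].where(df["budget"] > 0))
df["return"] = ratio.fillna(0.0).round(2)

print(df[["title", "return"]])
```

Under this definition, a zero in `return` cannot distinguish "no revenue" from "unknown budget", which is worth keeping in mind when reading the strong budget/revenue/return correlations flagged in the alerts.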